Tag: tools

Set default pipe/redirect encoding in Python

[via] Changing default encoding of Python? - StackOverflow

I ran into an issue using llm today where I was unable to save a response to a file using a pipe:

llm logs -n 1 | Out-File response.txt
This would give me the error "UnicodeEncodeError: 'charmap' codec can't encode character '\u2192' in position 2831: character maps to <undefined>"

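If you set the "PYTHONIOENCODING" environment variable to "utf8", it will fix the issue. When Python's output is piped or redirected on Windows, it is encoded with the system's ANSI code page (typically cp1252, which Python implements with the 'charmap' codec) rather than UTF-8. The last response I got back from the model contained '\u2192' (a right arrow), which that code page can't represent, so this error was thrown.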
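To see the failure in isolation, here is a minimal sketch that encodes the character from the error message with two codecs. It assumes the redirected output was being encoded as cp1252, which is a common Windows code page but may not be the one on your machine; the `can_encode` helper is just for illustration.

```python
# The character named in the error message: U+2192, a right arrow.
arrow = "\u2192"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be represented in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Assumption: cp1252 stands in for whatever ANSI code page Windows is using.
print("cp1252:", can_encode(arrow, "cp1252"))
print("utf-8: ", can_encode(arrow, "utf-8"))
```

Setting PYTHONIOENCODING changes which of these codecs Python picks for stdin, stdout, and stderr, so nothing in the tool itself has to change.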

So now, in my PowerShell profile, I've added a line to set the default to utf8, which fixes the issue.

$env:PYTHONIOENCODING = 'utf8'
# / 2025 / 04 / 21

Update all llm plugins

Quick one-liner to update all llm plugins using PowerShell:

llm plugins | ConvertFrom-Json | % { llm install -U $_.name }
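For anyone not on PowerShell, roughly the same thing can be sketched in Python. This assumes `llm plugins` prints a JSON array of objects with a "name" key, which is what the one-liner above relies on. The function names and the sample plugin listing are made up for illustration.

```python
import json
import subprocess


def plugin_names(plugins_json: str) -> list[str]:
    """Pull the plugin names out of the JSON printed by `llm plugins`."""
    return [plugin["name"] for plugin in json.loads(plugins_json)]


def upgrade_commands(names: list[str]) -> list[list[str]]:
    """Build one `llm install -U <name>` command per plugin."""
    return [["llm", "install", "-U", name] for name in names]


def update_all_plugins() -> None:
    """List the installed plugins, then upgrade them one at a time."""
    listing = subprocess.run(
        ["llm", "plugins"], capture_output=True, encoding="utf-8", check=True
    )
    for command in upgrade_commands(plugin_names(listing.stdout)):
        subprocess.run(command, check=True)


# Hypothetical sample output, only to show the shape of the data.
sample = '[{"name": "llm-example-plugin", "hooks": ["register_models"], "version": "0.1"}]'
print(upgrade_commands(plugin_names(sample)))
```

Calling `update_all_plugins()` runs the real upgrade; the sample at the bottom only prints the commands it would build.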
# / 2025 / 04 / 21

FYI: Tracking down transitive dependencies in .NET

dotnet nuget why - command reference

I just found that there is a new(ish) command for figuring out where a transitive dependency comes from in your dotnet project (available starting with the .NET SDK 8.0.4xx):

dotnet nuget why <PROJECT|SOLUTION> <PACKAGE>
If you have a dependency in your project that has a vulnerability, you can use this to figure out which package is bringing it in. For example, System.Net.Http 4.3.0 has a high severity vulnerability, and I've found instances where it is brought into my projects by other packages. It's very handy to be able to trace it with a built-in tool. Before this was available, I would use the dotnet-depends tool, which is a great tool, but it's a little clunkier than I'd like and doesn't seem to support central package management.
# / 2025 / 04 / 18

LLM templates

david-jarman/llm-templates: LLM templates to share

Simon Willison's LLM tool now supports sharing and re-using prompt templates. This means you can create YAML prompt templates in a GitHub repo and then consume them from anywhere using the syntax llm -t gh:{username}/{template-name}.

I have created my own repo where I will be uploading my prompt templates that I use. My most recent template that I've been getting value out of is "update-docs". I use this prompt/model combination to update documentation in my codebases after I've refactored code or added new functionality. The setup is that I use "files-to-prompt" to build the context of the codebase, including samples, then add a single markdown document that I want to be updated at the end. I've found that asking the AI to do too many things at once ends up with really bad results. I've also been playing around with different models. I haven't come to a conclusion on which is the absolute best for updating documentation, but so far o4-mini has given me better vibes than GPT 4.1.

Here is the one-liner command I use to update each document:

files-to-prompt -c -e cs -e md -e csproj --ignore "bin*" --ignore "obj*" /path/to/code /path/to/samples /path/to/doc.md | llm -t gh:david-jarman/update-docs
You can override the model in the llm call using "-m <model>":

llm -t gh:david-jarman/update-docs -m gemini-2.5-pro-exp-03-25
The next thing I'd like to tackle is creating a fragment provider for this scenario so I don't have to add so many paths to files-to-prompt. It's a bit clunky and I think it would be more elegant to just have a fragment provider that knows about my codebase structure and can bring in the samples and code without me needing to specify it each time.
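In the meantime, a small wrapper can at least hide the repeated flags. This is only a sketch and not a fragment provider: it rebuilds the files-to-prompt command from above, with the extensions and ignore patterns as defaults. The function name is made up, and the paths are the same placeholders used in the one-liner.

```python
import shlex


def files_to_prompt_command(
    paths,
    extensions=("cs", "md", "csproj"),
    ignore=("bin*", "obj*"),
):
    """Build the files-to-prompt argument list used in the one-liner above."""
    command = ["files-to-prompt", "-c"]
    for extension in extensions:
        command += ["-e", extension]
    for pattern in ignore:
        command += ["--ignore", pattern]
    # The document to update goes last so it ends up at the end of the prompt.
    command += list(paths)
    return command


# Placeholder paths, as in the post.
command = files_to_prompt_command(
    ["/path/to/code", "/path/to/samples", "/path/to/doc.md"]
)
print(shlex.join(command))
```

The printed command can be piped into llm -t gh:david-jarman/update-docs exactly as before.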
# / 2025 / 04 / 18

Creating a markdown file from Microsoft Learn docs

MarkItDown - GitHub

I just learned about a new open-source tool from Microsoft called MarkItDown. 

MarkItDown is a lightweight Python utility for converting various files to Markdown for use with LLMs and related text analysis pipelines.
This seems similar to pandoc, but instead of being able to take any formatted document type and convert it to any other type, it only outputs Markdown. It can be used as a standalone CLI tool or as a Python library.

I'm particularly interested in converting HTML to markdown, so that I can take public documentation online and convert it into a markdown file, which can be more effectively consumed by LLMs. I was playing around with this idea last week during a hackathon, where I wanted to take the query language specification for WIQL that is online and turn it into a compact prompt, so the LLM can more reliably create WIQL queries for me.

To get the HTML for the web page, I use Simon Willison's tool shot-scraper to dump the HTML of the page, and then pipe it into markitdown:

shot-scraper html https://learn.microsoft.com/en-us/azure/devops/boards/queries/wiql-syntax | markitdown > wiql.md
This produces a file called wiql.md (link to gist with unmodified output). It's certainly not perfect: the first 300 lines (out of around 1000) are not related to the documentation and are just extra HTML that isn't needed. This could probably be mitigated by passing an element selector to shot-scraper, so it doesn't dump the unrelated HTML of the page. But it's not hard to delete those lines manually, and then the final result is pretty good. It looks fairly similar to the original web page.

edit: Here is the one-liner to only dump the relevant part of the page. You have to wrap the output of shot-scraper in an <html> element so markitdown can infer the input type.

echo "<html>$(shot-scraper html https://learn.microsoft.com/en-us/azure/devops/boards/queries/wiql-syntax -s .content)</html>" | markitdown -o wiql.md
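The same steps can be sketched in Python, which avoids relying on shell command substitution. This assumes shot-scraper and markitdown are both on the PATH and reuses only the flags from the one-liner above. The function names are made up for illustration.

```python
import subprocess


def wrap_fragment(fragment: str) -> str:
    """Wrap an HTML fragment in <html> so markitdown can infer the input type."""
    return f"<html>{fragment}</html>"


def page_to_markdown(url: str, selector: str, output_path: str) -> None:
    """Dump one element of a page with shot-scraper and convert it with markitdown."""
    fragment = subprocess.run(
        ["shot-scraper", "html", url, "-s", selector],
        capture_output=True,
        encoding="utf-8",
        check=True,
    ).stdout
    # Feed the wrapped HTML to markitdown on stdin, like the pipe in the one-liner.
    subprocess.run(
        ["markitdown", "-o", output_path],
        input=wrap_fragment(fragment),
        encoding="utf-8",
        check=True,
    )


print(wrap_fragment("<div class='content'>WIQL syntax</div>"))
```

Calling page_to_markdown with the WIQL URL, ".content", and "wiql.md" should match the one-liner.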
Side by side comparison
MarkItDown also supports plugins, so you can extend it to support other file formats. I've only played around with this a little bit, but I think it will be handy to have a quick and easy way to convert more documents to Markdown. I'm particularly interested in the PDF and DOCX input types as well.
# / 2025 / 03 / 10