Imagine if you could run a powerful LLM locally. You wouldn't need to be online all the time to get value from it. Even more interesting is the idea that it could be more cost-effective - if you already have the right hardware, you don't have to pay a hosted provider's premiums. On top of that, you own the data you send to and receive from the model. Your questions aren't stored somewhere you haven't allowed, and your data isn't used to train the next model.

Microsoft recently released Foundry Local, a way to run the powerful Microsoft Foundry on your own machine. That means the OSS AI models available in Foundry can run locally too - which is super compelling.

I was able to get it up and running on my M1 MacBook, and it performs reasonably well. Definitely worth a look for on-device AI inference.
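
To give a feel for what talking to it looks like, here's a minimal sketch of hitting a locally served model through Foundry Local's OpenAI-compatible endpoint. The port and the model alias below are assumptions - check `foundry service status` and `foundry model list` on your machine for the real values.

```python
# Minimal sketch: chat with a model served by Foundry Local via its
# OpenAI-compatible API. base_url and model are assumptions - confirm the
# actual port with `foundry service status` and pick a model alias from
# `foundry model list`.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5273/v1",  # assumed local endpoint; the port can vary
    api_key="not-needed-locally",         # a real API key isn't required for local use
)

response = client.chat.completions.create(
    model="phi-3.5-mini",  # assumed model alias
    messages=[
        {"role": "user", "content": "Explain on-device inference in one sentence."}
    ],
)
print(response.choices[0].message.content)
```

Everything stays on the laptop: the request never leaves localhost, which is exactly the privacy and offline story described above.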