Something to spice up the API
violind4nc3r
Just wondering if we could try for integration with a headless UI, maybe even Selenium or AnythingLLM, to get device control. That would work great with all the tools, templates, and automations already made; honestly, it would turn that into a brain for it. Even figuring out how to add it to Power Automate or AutoKey would work.
Narek Zograbian
Do you have any examples of implementations related to this feature request in other apps?
violind4nc3r
Narek Zograbian, I understand that Claude is currently integrated through its API into various platforms, such as AnythingLLM, with desktop control, and is also available standalone on its own platform. Google has added vision capabilities, which GPT-4 also offers through OpenAI and which Taskade uses as well. Incorporating a vision model or an IFTTT-style trainer for automating tasks with hotkeys or macros could enhance AI operations. This wouldn't necessarily require a headless Python or UI server unless the user wanted to turn their AI agent into a Jarvis-like agent; typical hotkey training could enable GPT to perform actions. Alternatively, starting a foundation for vision on your platform could be beneficial. The focus could be on developing vision capabilities or hotkeys (macros), which GPT could support through a PyTorch base, Puppeteer, or privacy-focused training of agents for screen-based macros. This approach might expand features, especially for mobile actions and remote integration. Also, users who don't necessarily want screen vision could opt for self-trained macros instead (macros are everywhere, from Razer in gaming to AutoHotkey, Power Automate, and tons of others). Just a cool suggestion.
breyden
Narek Zograbian screenpi.pe uses multimodal ingestion to create headless workflows from screen recordings, audio, and system logs. Their dev cycle isn't quite as fast as Taskade's, but it has already improved quite a bit since I started playing with it (a couple of months ago).
Narek Zograbian
violind4nc3r Thanks for the overview! We've actually discussed some of these things too. Currently, the accuracy and cost of using vision in tandem with Puppeteer or other programs are not practical yet. As the technology improves, I think we'll look more into adding this functionality to Taskade.
Narek Zograbian
breyden This is definitely an interesting find! Thanks for sharing. I think we'll have to revisit some of these things in the near future for sure. Right now, I think vision + computer control isn't at that level of practicality yet.
violind4nc3r
Narek Zograbian I sent you a message with some good ideas I thought might work lol