The Complete Beginner’s Guide To Autonomous Agents

Ok, let’s start with what you already know.

Artificial intelligence can be used to complete very specific tasks, such as recommending content, writing copy, answering questions, and even generating photographs indistinguishable from real life.

You tell the AI to complete the one task, it completes the one task. Simple.

But what if you don’t want to have to come up with all of the tasks for the AI to do? What if you want a teammate rather than just a tool? What if you want the AI to think for itself?

Like really think for itself.

Imagine you made an AI that you could give an objective to, even something as vague as “Create the best ice cream in the world”, and the AI would come up with a todo list, do the todos, add new todos based on its progress, and then continue this process until the objective was met.

This is exactly what “Autonomous Agents” do, and they are the fastest growing trend amongst AI developers, yet most people don’t know about them.

(At the time of writing this article, no major publications have written about autonomous agents, and since publishing, only a few have covered it, so if you’re reading this… you’re very early.)

What are autonomous agents? Why are they such a big opportunity? How do they work? What does this look like in the future? How can I build or use one? How can I meet other people interested in autonomous agents?

These are the questions I’m going to answer for you right now.

“[Intelligent] autonomous agents are the natural endpoint of automation in general. In principle, an agent could be used to automate any other process. Once these agents become highly sophisticated and reliable, it is easy to imagine an exponential growth in automation across fields and industries.”

p.s. I am CEO and co-founder of Octane AI, where for seven years we have been building conversational AI products, and are more recently building generative AI and autonomous agent solutions for brands. In 2016 I predicted that around now chatbot interfaces would take off and start to replace standard website UI, and now over 100 million people use ChatGPT and websites like it. I am now similarly predicting that autonomous agents will be widely adopted in the future, but this prediction won’t take seven years to come true, it will happen blazingly fast.

p.p.s. After writing this article I showed the draft to 125 of the smartest and most interesting people I know, including Emad Mostaque (Founder of Stability AI), Tony Hu (Former Acting Head of Emerging Technology for the FBI, and founder of Bondoo AI), Troy Carter (Lady Gaga’s ex Manager), Sahil Lavingia (Founder of Gumroad), Elizabeth Yin (Co-Founder of Hustlefund VC), Hugh Howey (Author of Wool), Chris Yeh (Author of Blitzscaling), experts from NVIDIA, Meta, investors like Ryan Hoover (creator of Product Hunt) and Erica Brescia (Managing Director of Redpoint Ventures, prior GitHub COO), and many, many more. Their thoughts and opinions are sprinkled throughout, and they will give you unique insights shared with the world for the first time.

What Are Autonomous Agents?

Autonomous agents are programs, powered by AI, that when given an objective are able to create tasks for themselves, complete tasks, create new tasks, reprioritize their task list, complete the new top task, and loop until their objective is reached.

Read that description above one more time, because while it is simple, it is wild.

“The future of autonomous agents looks like everybody becoming a manager.”

Yohei Nakajima, creator of BabyAGI

Autonomous agents can be designed to do any number of things, from managing a social media account, to investing in the market, to coming up with the best children’s book.

“And these are, like, real? These exist right now?”

Yes, I know it sounds like science fiction, but these are functioning and real. If you can code you can make one in just a few minutes. And it is only the beginning.

“Humans waste inordinate amounts of time doing work that is tedious and manual when it could be done by computers and free them up for more creative pursuits, or to do things that only humans can currently do. Autonomous agents will enable people to get so much more done in so much less time, and – hopefully – spend much less time in front of screens over time!”

The programming techniques and the AI needed to power autonomous agents are real and extremely new. There are many open source projects, like AutoGPT, BabyAGI, and Microsoft’s Jarvis, that are trending on GitHub and within AI communities and departments.

In the first two weeks after open source autonomous agent codebases were created, almost 100,000 developers started building autonomous agents, improving them, and pushing them to their limits. The number of developers working with this technology is growing at an increasing rate.

“AI agents will be everywhere. Billion-dollar companies will come from a small team that deploys ai agents.”

Auto-GPT’s GitHub popularity has grown past that of long-time popular codebases including Laravel, Bitcoin, Django, and PyTorch.

Auto-GPT Github popularity increasing exponentially, faster than any codebase in history

This is not science fiction. Many think these autonomous agents are the beginning of true Artificial General Intelligence, commonly referred to as “AGI”, a term for an AI that can match human ability across a broad range of tasks, and which some imagine as an AI that has gained sentience and become “alive”.

“Autonomous agents may end up commoditizing all applications of factual knowledge. If access to factual knowledge also becomes universal, then human qualities like creativity, emotion, and strategic vision will become even more distinctive. But it is also possible that knowledge becomes increasingly proprietary, as individuals and companies try to gain economic advantage in a world where applications of factual knowledge are commoditized, and the collective knowledge of humanity begins to stagnate.”

Check out this autonomous agent that was just released from HyperWrite, you can see it living in the browser and helping you order a pizza.

You just say “order a large plain pizza from Dominos to One Vanderbilt” and it just… does it.

HyperWrite’s autonomous agent controlling the browser to order pizza

Or, maybe even more impressive, check out this experiment done in collaboration between Stanford and Google where they created a virtual town of 25 autonomous agents, and told one of them to plan a Valentine’s day party.

The simulated people went about their days, talking to each other, forming new memories, and eventually most of them heard about, and showed up to, the Valentine’s day party.

From the research paper “Generative Agents: Interactive Simulacra of Human Behavior”

“Ok, uh, crazy… So autonomous agents are real… And you just tell it what its goal is and then after that it manages itself forever?”

You just give it the one objective, and the autonomous agent does the rest.

Just like a really good employee or teammate.

Although, if you wanted to, you could also design the autonomous agent to check in with you at certain key decision-making moments so that you could briefly collaborate on its work.

“It is "primitive AGI". It is remarkable that simply wrapping an LLM inside a loop gets you an autonomous agent that can reason, plan, think, remember, learn – all on its own. It demonstrates the untapped power and flexibility of what LLMs can do if wrapped in the right structures and prompts. The entire concept is less than a month old so I can’t wait to see how increasingly sophisticated agents built off of increasingly more capable LLMs impact the world.”

“But what can autonomous agents do, Matt? Like when you say they complete tasks, what the heck do you mean by that?”

In addition to analyzing their objective, and coming up with tasks, autonomous agents can have a range of abilities that can enable them to complete any digital task a human could, such as:

Access to browsing the internet and using apps
Long-term and short-term memory
Control of your computer
Access to a credit card or other form of payment
Access to large language models (LLMs) like GPT for analysis, summarization, opinion, and answers.

Also, these autonomous agents will come in all shapes and sizes. Some will operate behind the scenes where the user is unaware of what they are doing, while some will be visible, like in the example above, where the user can follow along with each “thought” the AI has.

“Autonomous agents will allow everyone to live like a head of state! Need something done? Just ask, and your agents will take care of the rest. Never again will you have to waste brainpower on the routine or mundane.”

“Matt, I’m reading what you’re writing, I think I know what you are saying, but can you write out an example in plain English so I can be sure I understand?”

Here is a super simple example of how an autonomous agent could work.

Let’s say that there is an autonomous agent that helps with research, and we want a summary of the latest news about a certain topic, let’s say “News about Twitter”.

We tell the agent “Your objective is to find out the recent news about Twitter and then send me a summary”.
So the agent looks at the objective, uses an AI like OpenAI’s GPT-4 which allows it to understand what it is reading, and comes up with its first task. “Task: Search Google for news related to Twitter”.
The agent then searches Google for Twitter news, finds the top articles, and comes back with a list of links. The first task is complete.
Now the agent looks back at its main objective (to find out the recent news about Twitter and then send a summary) and at what it just completed (got a bunch of links of news about Twitter) and decides what its next tasks need to be.
It comes up with two new tasks. 1) Write a summary of the news. 2) Read the contents of the news links found via Google.
Now the agent stops for a second before continuing, because it needs to make sure that these tasks are in the right order. Should it really be writing the summary first? No, it determines that the top priority is to read the contents of the news links found via Google.
The agent reads the content from the articles, and then once again comes back to the todo list. It thinks to add a new task to summarize the content, but that task is already on the todo list, so it doesn’t add it.
The agent checks the todo list. The only item left is to summarize the content it read, so it does that. It sends you the summary just like you asked.
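If you read code, the todo list handling in the walkthrough above can be sketched in a few lines of Python. This is only an illustration, not how any particular agent is implemented: the task names, the `PRIORITY` table, and the helpers `add_task` and `reprioritize` are all made up for this example, and a real agent would ask a language model to propose and rank tasks instead of using a fixed table.

```python
# Hypothetical sketch of the todo list handling in the Twitter news example.
# A fixed priority table stands in for the judgment an LLM would provide.

# Lower number = should happen earlier (assumed ordering for this example).
PRIORITY = {
    "search Google for Twitter news": 0,
    "read the news links": 1,
    "write a summary of the news": 2,
}


def add_task(todo, task):
    """Add a task only if it is not already on the list."""
    if task not in todo:
        todo.append(task)


def reprioritize(todo):
    """Return the list ordered so prerequisite tasks come first."""
    return sorted(todo, key=lambda task: PRIORITY[task])


todo = ["search Google for Twitter news"]
done = []

# The first task (the search) is completed.
done.append(todo.pop(0))

# The agent proposes two new tasks, in the "wrong" order.
add_task(todo, "write a summary of the news")
add_task(todo, "read the news links")

# Reprioritize so that reading happens before summarizing, then read.
todo = reprioritize(todo)
done.append(todo.pop(0))

# The agent thinks of summarizing again, but the task is already listed.
add_task(todo, "write a summary of the news")

# The only remaining task is the summary.
done.append(todo.pop(0))
```

After this runs, `done` holds the three tasks in the order the agent completed them and `todo` is empty, which is the point where the agent would send you the summary.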

Here is a diagram showing how this works.

From Yohei Nakajima’s BabyAGI

And keep in mind that this is the very beginning of this new paradigm. It’s not perfect, it hasn’t taken over the world yet, but the concept is frighteningly powerful and with increased development and experimentation will quickly find its way into our daily lives.

“This will soon transform many industries. It will be a lot easier for people to do many things at once with the use of Autonomous Agents. Just give it a task, and it will complete it. Such a powerful concept so far…”

So now you understand at a high level what an autonomous agent is, but why exactly are these such a big opportunity?

“If we’re able to get the information we need faster, will this allow us to free up time to dedicate to thinking vs. doing? Will even better and more creative ideas surge as a consequence of investing less time on tasks that can be carried out by this AI agent?”

Why Autonomous Agents Are Such A Big Opportunity

It’s pretty clear that soon you won’t only have the options of hiring humans as employees, you will have the ability to hire AIs in the form of autonomous agents.

“In the mid-term, I believe you’re going to see a huge rise in 1-2 people startups that use a combination of AutoGPTs and tools like ChatGPT. And they’ll be able to make the kind of progress you’d previously have expected from a 100 person startup. Long-term I believe that most work can and will be replaced by AutoGPTs.”

And they are not going to be nearly as expensive as people are, they won’t sleep, they won’t quit, and they will work extremely efficiently.

“Part of the thesis when I started Product Hunt in 2013 was a belief that the barrier to build software products would continue to lower, enabling smaller teams (or a single person) to build more and faster than ever before. This has never been more true today, accelerated by AI and autonomous agents. This introduces anxiety for some and opportunity for others that leverage this tech to scale their ideas with fewer people and capital required. In the end, consumers will greatly benefit through increased competition and experimentation of new solutions to their problems.”

These autonomous agents will exist in every industry and for every task imaginable.

These are just a handful of examples. Let your imagination run wild.

The list can go on and on. Anything a person could do, an autonomous agent will (eventually, but soon, and in some cases already) be able to do better.

“The music industry has imposed too many unnecessary layers that sit between an artist and success. Those layers cost an artist close to 35% of their net income. Autonomous Agents will be able to build and execute marketing strategies, engage with fans, build communities, route tours, book venues, and negotiate contracts. Saving the artist money and time.”

So what do you do with this information?

There are two very real opportunities.

You create autonomous agents and make them available for others to hire.
You hire autonomous agents and can now afford to be more productive in your personal life, or in business.

“Autonomous Agents are the next wave — not just in tech, but in business at large. I predict that within 10 years, there will be multiple billion-dollar companies run entirely by autonomous agents. It is inevitable.”

Imagine a world where one person builds a company with only autonomous agents on their team. Within your lifetime you will likely see a one person team do this and reach a market cap of over a billion dollars, something it usually takes many many people working together to accomplish.

“Personalization at scale is going to be a very interesting use case. You will be able to put on auto-pilot multi-step processes that humans do today that involve generating personalized images, videos, websites, emails or even calls at scale. One use case that has sparked a lot of interest is sales prospecting.”

In these early days, early movers, whether they are making autonomous agents or using them, will have a huge advantage over competitors that are not yet leveraging these systems.

“In the near future, I expect to see lunch meetings, phone calls, and interviews appear on my calendar without my involvement. My agents and their agents will have made it happen, taking care of all the details. I just need to be there.”

By reading this article you are already ahead of 99% of the world.

Let’s dive into more detail on how these autonomous agents work.

“Autonomous agents have the potential to supercharge the output of smaller content creators and community members, especially those with creative imaginations. This will be a boon for many Web3 projects.”

How Autonomous Agents Work

You’ve already read a high-level overview of how autonomous agents work, but I thought it would be helpful to give you one version of an overall framework, as well as break down a couple of examples of autonomous agents step by step.

“I see AI as a whole right now and we are in the building blocks that will evolve to become artificial intelligence assistants like we have seen in the movies — like Jarvis from Ironman or TARS from Interstellar.

Right now is a time to build out the frameworks because the AI itself is still improving. The answers might not be that good. It might have errors. But just looking at how much has improved with respect to AI in the last 6 months, I think we can barely imagine how things will be in the next 1-2 years. So this is about experimenting early, fast, and skating where the puck is heading.”

First, here is a generalized framework for an autonomous agent:

Initialize Goal: Define the objective for the AI.
Task Creation: The AI checks its memory for the last X tasks completed (if any), and then uses its objective, and the context of its recently completed tasks, to generate a list of new tasks.
Task Execution: The AI executes the tasks autonomously.
Memory Storage: The task and executed results are stored in a vector database.
Feedback Collection: The AI collects feedback on the completed task, either in the form of external data or internal dialogue from the AI. This feedback will be used to inform the next iteration of the Adaptive Process Loop.
New Task Generation: The AI generates new tasks based on the collected feedback and internal dialogue.
Task Prioritization: The AI reprioritizes the task list by reviewing its objective and looking at the last task completed.
Task Selection: The AI selects the top task from the prioritized list, and proceeds to execute it as described in step 3.
Iteration: The AI repeats steps 4 through 8 in a continuous loop, allowing the system to adapt and evolve based on new information, feedback, and changing requirements.
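For readers who code, here is one rough way the framework above could look in Python. It is a sketch under simplifying assumptions, not the code of BabyAGI or any other project: `run_agent` and the three `toy_` functions are names invented for this example, a plain list stands in for the vector database, and the toy functions stand in for what would normally be calls to a language model. A `max_steps` limit is added so the loop always stops.

```python
from collections import deque


def run_agent(objective, create_tasks, execute, prioritize, max_steps=10):
    """Run a simple task loop until no tasks remain or max_steps is hit.

    create_tasks, execute, and prioritize are caller-supplied functions;
    in a real agent each would typically wrap a call to a language model.
    """
    memory = []  # stand-in for the vector database: (task, result) pairs
    tasks = deque(create_tasks(objective, memory))  # initial task creation

    for _ in range(max_steps):
        if not tasks:
            break  # nothing left to do: treat the objective as met
        task = tasks.popleft()         # task selection
        result = execute(task)         # task execution
        memory.append((task, result))  # memory storage
        # New task generation, skipping tasks already queued or completed.
        known = set(tasks) | {done_task for done_task, _ in memory}
        for new_task in create_tasks(objective, memory):
            if new_task not in known:
                tasks.append(new_task)
                known.add(new_task)
        tasks = deque(prioritize(objective, list(tasks)))  # reprioritization
    return memory


# Toy stand-ins so the loop can run without any model or network access.
def toy_create_tasks(objective, memory):
    if not memory:
        return ["research flavors"]
    if memory[-1][0] == "research flavors":
        return ["write recipe", "pick top flavor"]
    return []


def toy_execute(task):
    return f"done: {task}"


def toy_prioritize(objective, tasks):
    order = {"pick top flavor": 0, "write recipe": 1}
    return sorted(tasks, key=lambda task: order.get(task, 99))


history = run_agent(
    "Create the best ice cream in the world",
    toy_create_tasks,
    toy_execute,
    toy_prioritize,
)
```

With the toy functions, the agent researches flavors, queues two follow-up tasks, reorders them so that picking a flavor comes before writing the recipe, and stops when the list is empty. Swapping the toy functions for real model calls and real tools is where the actual work of building an agent lies.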

Now let’s apply it to a few different use cases.

“Autonomous agents are truly captivating to me because they embody the ultimate productivity booster. As someone who highly values automation for tedious or repetitive tasks, I find that these agents have the potential to revolutionize the way we work, allowing us to direct our mental energy towards more meaningful pursuits.”

Gabriel Menezes, Director of Engineering at Octane AI

Example #1: Social Media Manager Autonomous Agent

Let’s say that instead of hiring a social media manager to manage your social media accounts, you wanted an autonomous agent to do everything for you at a fraction of the cost and with round-the-clock intelligence.

“This is beyond just virtual assistants. This is a revolution in accelerating all work, research, and even play online. Anything you can do online that takes hours, days, months can now be completed in the background in minutes.”

Here’s what a framework for that autonomous agent might look like.

Initialize Goal: Set up the initial parameters, such as target audience, social media platforms, content categories, and posting frequency.
Data Collection: Collect data on past social media posts, user interactions, and platform-specific trends. This could include likes, shares, comments, and other engagement metrics.
Content Analysis: Analyze the collected data to identify patterns, popular topics, hashtags, and influencers relevant to your target audience. This step could involve natural language processing and machine learning techniques to understand the content and its context.
Content Creation: Based on the analysis, generate content ideas and create social media posts tailored to the platform and audience preferences. This could involve using AI-generated text, images, or videos, as well as incorporating user-generated content or curated content from other sources.
Scheduling: Determine the optimal time to post each piece of content based on platform-specific trends, audience activity, and desired frequency. Schedule the posts accordingly.
Performance Monitoring: Track the performance of each post in terms of engagement metrics, such as likes, shares, comments, and click-through rates. Gather user feedback, if possible, to further refine the understanding of audience preferences.
Iteration and Improvement: Analyze the performance data and user feedback to identify areas for improvement. Update the content strategy, creation, and scheduling processes to incorporate these insights. Iterate through steps 2–7 to continuously refine the social media management system and improve its effectiveness over time.
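To make the Scheduling step a little more concrete, here is a small Python sketch of one way an agent might pick posting times from past engagement. The data in `past_posts` is invented, `best_posting_hours` is a hypothetical helper, and averaging engagement by hour is just one simple heuristic; a real system would likely weigh many more signals.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical past posts: (hour of day posted, total engagement).
past_posts = [
    (9, 120), (9, 80), (13, 200), (13, 260), (18, 150), (18, 90), (21, 40),
]


def best_posting_hours(posts, top_n=2):
    """Rank posting hours by average engagement, highest first."""
    by_hour = defaultdict(list)
    for hour, engagement in posts:
        by_hour[hour].append(engagement)
    averages = {hour: mean(values) for hour, values in by_hour.items()}
    ranked = sorted(averages, key=averages.get, reverse=True)
    return ranked[:top_n]


schedule = best_posting_hours(past_posts)
```

With this made-up data, 1 pm and 6 pm come out on top. As the Performance Monitoring step collects new numbers, the agent would rerun this with fresh data, which is how the loop adapts over time.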

“People will own personal agents which communicate with agents owned by other people and businesses. Most computing devices will primarily serve as communication devices for speaking with agents.”

By incorporating this loop-type system in social media management, you can create a dynamic and adaptive strategy that evolves with your audience’s preferences and the constantly changing social media landscape. This will help to increase engagement, reach, and overall effectiveness of your social media efforts.

“Another use case for an autonomous agent that excites me is its application in the realm of music composition. By leveraging the power of AI-driven algorithms, these agents can analyze my personal preferences, favorite genres, and even specific musical elements that resonate with me. They can then generate original melodies, harmonies, and rhythms, effectively co-creating music alongside me. This creative collaboration has the potential to broaden my musical horizons, enabling me to explore new styles and genres I may not have considered before. Moreover, the autonomous agent can provide valuable feedback on my compositions and offer suggestions for improvement, nurturing my growth as a musician. The fusion of AI and human creativity in the music composition process can lead to innovative and unique results, expanding the boundaries of artistic expression.”

Example #2: Political Campaign Manager Autonomous Agent

What if you are running for political office and you want to leverage an intelligent and never-sleeping assistant to help you win?

“I’m excited about agents that do work that’s not necessarily hard to do but just requires some time and effort. For example, things like booking flights I would love to outsource to an agent.”

This is what an autonomous agent that helps you win an election might look like.

Initialize Goal: Win the election by securing the majority of votes.
Data Collection: Gather data on voters, demographics, key issues, campaign messaging, and other relevant information.
Context Analysis: Analyze the collected data to identify trends, opportunities, and challenges. Refine the initial goal into specific subgoals based on this analysis, such as targeting undecided voters, increasing voter turnout in key areas, or improving campaign messaging on particular issues.
Task Generation: Generate tasks related to the refined subgoals, such as planning voter outreach events, creating targeted advertisements, or developing policy proposals.
Task Prioritization: Rank tasks based on their potential impact on achieving the subgoals and the overall goal of winning the election.
Task Execution: Execute the highest priority tasks, allocating resources and assigning team members as needed.
Performance Monitoring: Assess the effectiveness of completed tasks by tracking key performance indicators like voter engagement, public opinion, and fundraising metrics. Evaluate the success of individual tasks and overall campaign progress toward the subgoals and initial goal.
Iteration and Improvement: Analyze the performance data to identify areas for improvement. Update the campaign strategy to incorporate these insights. Iterate through steps 2–8 to continuously refine the political campaign management system and improve its effectiveness over time.
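The Task Prioritization step above boils down to keeping a list ordered by estimated impact and always taking the top item. A minimal Python sketch of that idea, using a priority queue, might look like this. The tasks and their impact scores are invented for illustration, and `build_queue` and `next_task` are hypothetical helpers; in a real agent the scores would come from the model’s own analysis of the campaign data.

```python
import heapq

# Hypothetical tasks with made-up impact estimates (0 to 1).
campaign_tasks = [
    ("plan voter outreach event", 0.7),
    ("create targeted advertisements", 0.9),
    ("develop policy proposal", 0.4),
]


def build_queue(tasks):
    """Build a max-priority queue; heapq is a min-heap, so negate impact."""
    queue = [(-impact, name) for name, impact in tasks]
    heapq.heapify(queue)
    return queue


def next_task(queue):
    """Pop the task with the highest estimated impact."""
    neg_impact, name = heapq.heappop(queue)
    return name, -neg_impact


queue = build_queue(campaign_tasks)
first = next_task(queue)
second = next_task(queue)
```

Here the advertisements are picked first and the outreach event second. When Performance Monitoring changes the impact estimates, the agent would rebuild the queue with the updated scores.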

“I’m most excited by the recursive self-cloning capability. The AI agent can create a copy of itself, pass on task directives, and start talking with its own sibling to get the job done. It is quite a remarkable but alien emergent ability.”

At first one candidate might use an autonomous agent and have a huge advantage over everyone, but then imagine what this looks like once every candidate has one… or many.

“I don’t think everyone will use autonomous agents. They will be everywhere but as AI becomes ubiquitous there will be a revival of 100% human work. Many people will rediscover pen and paper, want human-only art… We will see many products and creations that will advertise "only made by humans". It should become a very popular label very soon. The more technology grows the more I enjoy long periods of completely offline time, and soon also "off AI" time.”

Example #3: Math Tutor Autonomous Agent

Here is an autonomous agent that is designed to teach a child math.

“This is a breakthrough paradigm that has a LOT of room for exploration. Although early experiments have limited agents to search queries, we’re going to see a wide range of research and side projects arming autonomous agents with new batches of tools. Each set of tools will significantly expand the potential use cases.”

Initialize Goal: Identify the child’s current math skill level and set a personalized learning path to help them improve.
Data Collection: Gather information on the child’s learning style, progress, and performance through assessments, interactions, and feedback.
Context Analysis: Analyze the collected data to identify strengths, weaknesses, and learning preferences, as well as any external factors influencing the child’s progress.
Task Generation: Generate tutoring tasks based on the child’s needs and learning path, such as selecting appropriate exercises, providing explanations, or offering real-life examples and applications.
Task Prioritization: Rank tutoring tasks based on their potential impact on the child’s learning and skill development, ensuring a balance between challenge and engagement.
Task Execution: Execute the highest priority tasks, adapting the tutoring approach and content delivery as needed to maximize the child’s learning and engagement.
Performance Monitoring: Assess the effectiveness of the tutoring by tracking key performance indicators (KPIs) such as progress toward learning goals, improvement in math skills, and the child’s engagement and satisfaction.
Feedback Loop: Continuously monitor the child’s performance and update the context analysis, task generation, and task prioritization steps based on new data and insights. Adjust the initial goal and learning path as necessary to better support the child’s math skill development.
Iteration and Improvement: Analyze the child’s performance to identify areas for improvement, then iterate through steps 2–9 to continuously refine the math tutoring system and improve its effectiveness over time.
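As a small illustration of the balance between challenge and engagement described in the steps above, here is a Python sketch that adjusts exercise difficulty from the child’s recent answers. Everything in it is an assumption made for the example: the function name `next_difficulty`, the 1 to 10 level scale, the five-answer window, and the 80% and 50% thresholds. A real tutor agent would likely use a richer model of the child’s progress.

```python
def next_difficulty(current_level, recent_results, window=5,
                    raise_at=0.8, lower_at=0.5, min_level=1, max_level=10):
    """Pick the next exercise level from the child's recent answers.

    recent_results is a list of booleans (True = correct answer).
    """
    recent = recent_results[-window:]
    if not recent:
        return current_level  # no data yet: stay at the current level
    accuracy = sum(recent) / len(recent)
    if accuracy >= raise_at:
        return min(current_level + 1, max_level)  # doing well: step up
    if accuracy < lower_at:
        return max(current_level - 1, min_level)  # struggling: step down
    return current_level


level = 3
level_after_streak = next_difficulty(level, [True, True, True, True, False])
level_after_struggle = next_difficulty(level, [False, False, True, False])
level_no_data = next_difficulty(level, [])
```

Four correct answers out of five moves the child up a level, one out of four moves them down, and with no answers yet the level stays the same.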

This loop-type autonomous agent system outlines a process for an educational math tutor to adaptively support and guide a child’s learning experience, focusing on continuous improvement and personalization based on the child’s needs and progress.

“Just like there will be numerous models of all sizes in the future, you’ll have multiple agents for different facets of your life: an agent for work, an agent for your family/home life, an agent for self-improvement, all working in tandem with other agents. Automating mundane tasks or giving you professional superpowers are the first obvious use cases, but your digital twin will be capable of so much – perhaps even going on dates without your involvement to assess fit, forever eliminating the bad first date.”

Vivian Cheng, Principal at CRV

The Future Of Autonomous Agents

Right now humanity is in the very beginning of developing autonomous agents. We’re poking around, breaking things, experimenting, making bad things, making good things.

“Autonomous agents will bring your ideas to life simply by requesting their assistance. These agents can serve as friends, colleagues, and collaborators, affording you an abundance of leisure time. I’m curious to know, how would you choose to spend this newfound freedom?”

Barely any commercialized products have even been released; everyone is still in development mode.

But soon, that is going to change. Autonomous agents are going to start showing up all over the place until one day it will be incredibly strange for someone to not have one, or multiple, autonomous agents helping them out at any given time.

“Rather than focus on replacing people’s work, focus on augmenting what they can do. Making something "smart" used to mean making its data available via api. The next generation of making something smart will be to ask how that product can better assist you. As an example, a "smart" email address might be able to take action in interesting ways based on your preferences. If you’re a big shopper, maybe it monitors emails for when an item you’re interested in goes on sale, price compares, or even negotiates price on your behalf, knowing privately to what degree you value the item and how much you’re willing to pay.”

People will move through life with autonomous agents of all kinds augmenting their movements, decisions, and actions. If at some point we have neural implants then this will all happen seamlessly just like thinking in your own head works today.

“Everyone will have access to a virtual researcher, assistant, writer, or worker at no or low cost. Access is democratized.”

Here are my predictions for the future of autonomous agents:

2023: Multiple commercialized autonomous agents for gaming, personal use, marketing, and sales.
2024: Commercialized autonomous agents for every category but not mainstream adoption.
2025: Mainstream adoption of autonomous agents in every category for everything imaginable.
2026: Most people in first-world countries are going about everyday life with the support of an army of autonomous agents.

In the next 2-5 years most people will work for an autonomous agent instead of a human.

“I see using an augmented reality Holodeck, almost wholly driven by AIs, where lots of things are happening both automatically and with your manual prompting. Yes, people will work for the AIs. Everyone will use them, yes, but only a few will know what they are or how to make them. The world is about to change deeply because of LLMs and the coming autonomous agents and systems. LLMs (Large Language Models) are the most democratizing force humans have ever invented. Why? LLMs can now run on cheap computers without being connected to a central server. That little engine basically includes all human knowledge. Incredible that you can run that on something that isn’t connected to the Internet. Autonomous agents just make this Holodeck run almost automatically. Everything from weather to pizza delivery happening almost automatically with very little human input.”

“This is a lot to take in Matt, the future is going to be wild. Where can I start with autonomous agents today though?”

This is the best question to ask. I have all the resources you need.

“In this future, everyone will likely use autonomous agents in some capacity, whether for personal productivity, business operations, or creative endeavors. For the most part, people will serve as "maestros" to these AI agents, setting their goals and nudging them along. We will also "work for AI agents" in the same way that we must work within the constraints of companies, processes and other systems. However I think AI Agents will in many cases do a much better job than companies and systems in society do today, and will create opportunities that will benefit everyone on the whole.”

How To Build And Use Autonomous Agents

You are now ready to jump headfirst into the world of autonomous agents. I’m going to give you the resources you need to get started building or using autonomous agents on your own.

“Find a specific B2B use case with a lot of repetitive tasks. Sales ops. Ad ops. Event ops. Accounting ops. There are so many to choose from right now.”

I’m excited to see what you can do with this, and if you make something cool, I would love to check it out.

“First, narrow down your use case, as much as you can. Then, design a product that involves a human-in-the-loop, and a way to estimate the process’ success. And step-by-step increase automation. And only then expand to adjacent use cases.”

Building Autonomous Agents

You have a couple different options here.

Build It Yourself: Look at the framework I provided earlier and embark on a journey to build everything from scratch! You can definitely do this; it’s not as scary as it might sound. Some recommended software solutions are OpenAI’s GPT-4, the Pinecone vector database, and LangChain’s framework.
Auto-GPT: This is a popular open source option created by Toran Richards. It includes options to connect to the internet, use apps, long-term and short-term memory, and more.
BabyAGI: Another popular open source option, this one created by Yohei Nakajima. While this one doesn’t connect to the internet yet, it is extremely elegant with under 200 lines of code.
Microsoft’s Jarvis: Very similar to Auto-GPT and BabyAGI, but much more robust and brought to you by Microsoft and HuggingFace.
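
If you go the build-it-yourself route, the core loop from the start of this guide (make a todo list, do the todos, add new todos based on progress, repeat) is small enough to sketch in a few lines of Python. This is only a rough illustration, not how Auto-GPT or BabyAGI are actually written: run_agent, fake_execute and fake_propose are hypothetical names, and the two fake functions stand in for the real LLM calls you would plug in.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def run_agent(
    objective: str,
    execute: Callable[[str, str], str],
    propose: Callable[[str, str, str], List[str]],
    first_task: str = "Make a todo list",
    max_steps: int = 10,
) -> List[Tuple[str, str]]:
    """Work through a todo list until it is empty or max_steps is reached."""
    tasks: Deque[str] = deque([first_task])
    log: List[Tuple[str, str]] = []
    while tasks and len(log) < max_steps:
        task = tasks.popleft()                 # take the next todo
        result = execute(objective, task)      # do the todo
        log.append((task, result))
        # add new todos based on progress, skipping ones already queued
        for new_task in propose(objective, task, result):
            if new_task not in tasks:
                tasks.append(new_task)
    return log


# Hypothetical stand-ins for LLM calls, so the sketch runs offline.
def fake_execute(objective: str, task: str) -> str:
    return f"done: {task}"


def fake_propose(objective: str, task: str, result: str) -> List[str]:
    if task == "Make a todo list":
        return ["Research flavors", "Test a recipe"]
    return []


log = run_agent("Create the best ice cream in the world", fake_execute, fake_propose)
for task, result in log:
    print(task, "->", result)
```

The max_steps cap is a design choice worth keeping in any real version: an agent that keeps inventing new todos will otherwise run (and spend API credits) forever.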

“I think we’ll initially have vertical-specific autonomous agents that are fine-tuned on a certain set of data that allows them to take on a role in that field. The two (only?) areas of LLMs where we’ve seen big adoption so far are copywriting and programming. Extrapolating further, it makes sense to think that the AIs we have in those two spaces will start to become more autonomous. One way that could play out in the near future is that instead of the human giving a prompt to initialize the copywriting or the code completion, the AI autonomously gives you new suggestions each day for you to review, without you first having to start or prompt them.”

Using Autonomous Agents

Ready to have your own agent? Here are some options.

Spin up any of the options in the build your own section above!
AgentGPT: Create and run an autonomous agent (AutoGPT) from a website, no login required.
HyperWrite Assistant: Add a Chrome extension that lets you give your browser commands, and the browser follows through.

people from all walks of life can benefit from the expertise and efficiency previously reserved for the upper echelons of society. This democratization of personal assistance can lead to greater productivity and a more balanced work-life experience, empowering individuals to focus on their passions, creativity, and personal growth while their AI assistants take care of the more mundane aspects of their daily lives.”

Additional Resources:

No matter if you can code, or you don’t yet know how, I encourage you to take a few hours to experiment with these. It is not as complex or as difficult as it may seem and the quicker you get your hands dirty the faster you’re going to learn about autonomous agents.

“As an investor, using autonomous agents to do the jobs of analysts and associates, or at least supercharge them, really excites me. They could be programmed to source deals under certain conditions, analyze them via certain factors, and then tee up custom emails for me to send in order to start conversations.”

The autonomous agent landscape is wide open for interpretation and innovation. 99% of use cases have not been created or attempted, the possibilities are endless and the opportunity is yours for the taking.

“I’m very interested in the orchestration and modularization of smaller programming tasks towards a bigger end goal. We know LLMs are good at programming on a problem basis but we haven’t seen proof points that they could, for example, port an entire codebase from Android to iOS, or even create an app from scratch. I suspect an agent with the right orchestration scheme and memory structure could make this happen.”

This space is moving incredibly fast, faster than anything I have ever seen before. Every hour it feels like there is new information, new experiments, and new releases.

So how do you keep up with it all?

I got you covered. Come with me.

How To Meet People Interested In Autonomous Agents

You are only at the beginning of your autonomous agents journey, and I know you are still burning with questions and ideas you want to share.

If you’re sitting there thinking any of the following then I have the perfect solutions for you:

“I wish I could stay up to date on new developments in autonomous agents”
“I have an idea for an autonomous agent, I want to share it with someone and see what they think!”
“I built an autonomous agent, I would love to share it with people!”
“I want to invest in people building autonomous agents”

If this sounds like you, and your autonomous agent curiosity has been sparked, here’s what you should do next.

For example when I talked about autonomous agents to Emad Mostaque, the founder and CEO of Stability AI, his response was a coy “Swarm intelligence will beat AGI.” What does he mean by that? Subscribe to my newsletter and we’ll explore it deeper.

The world is changing fast and I am so excited to dive headfirst with you into merging humanity with artificial intelligence.

Build something people want. Try not to destroy the world on accident. I’ll talk to you soon.

p.s. Want to chat? I’d love to hear from you. Reach out on Twitter @MattPRD or send me an email at matt at mattprd dot com.

from Matt Schlicht’s AI Newsletter https://www.mattprd.com/p/the-complete-beginners-guide-to-autonomous-agents

Articles

All JavaScript and TypeScript features of the last 3 years

This article goes through almost all of the changes of the last 3 years (and some from earlier) in JavaScript / ECMAScript and TypeScript.

from Sidebar https://medium.com/@LinusSchlumberger/all-javascript-and-typescript-features-of-the-last-3-years-629c57e73e42

Articles

Going algorithmic

“We are now in transition from an object-oriented to a systems-oriented culture. Here change emanates, not from things, but from the way…

from Design Systems on Medium https://medium.com/bye-bye-bauhaus/going-algorithmic-f97e51d0f262

Articles

The Market for Lemons

For most of the past decade, I have spent a considerable fraction of my professional life consulting with teams building on the web.

It is not going well.

Not only are new services being built to a self-defeatingly low UX and performance standard, existing experiences are pervasively re-developed on unspeakably slow, JS-taxed stacks. At a business level, this is a disaster, raising the question: “why are new teams buying into stacks that have failed so often before?”

In other words, “why is this market so inefficient?”

George Akerlof’s most famous paper introduced economists to the idea that information asymmetries distort markets and reduce the quality of goods because sellers with more information can pass off low-quality items as more valuable than informed buyers appraise them to be. (PDF, summary)

Customers that can’t assess the quality of products pay the wrong amount for them, creating a disincentive for high-quality products to emerge and working against their success when they do. For many years, this effect has dominated the frontend technology market. Partisans for slow, complex frameworks have successfully marketed lemons as the hot new thing, despite the pervasive failures in their wake, crowding out higher-quality options in the process.

These technologies were initially pitched on the back of “better user experiences”, but have utterly failed to deliver on that promise outside of the high-management-maturity organisations in which they were born. Transplanted into the wider web, these new stacks have proven to be expensive duds.

The complexity merchants knew their environments weren’t typical, but they sold highly specialised tools as though they were generally appropriate. They understood that most websites lack tight latency budgeting, dedicated performance teams, hawkish management reviews, ship gates to prevent regressions, and end-to-end measurements of critical user journeys. They understood that the only way to scale JS-driven frontends is massive investment in controlling complexity, but warned none of their customers.

They also knew that their choices were hard to replicate. Few can afford to build and maintain 3+ versions of the same web app (“desktop”, “mobile”, and “lite”), and vanishingly few scaled sites feature long sessions and login-gated content.

Armed with all of this background and knowledge, they kept the caveats to themselves.

What Did They Know And When Did They Know It?

This information asymmetry persists; the worst actors still haven’t levelled with their communities about what it takes to operate complex JS stacks at scale. They did not signpost the delicate balance of engineering constraints that allowed their products to adopt this new, slow, and complicated tech. Why? For the same reason used car dealers don’t talk up average monthly repair costs.

The market for lemons depends on customers having less information than those selling shoddy products. Some who hyped these stacks early on were earnestly ignorant, which is forgivable when recognition of error leads to changes in behaviour. But that’s not what the most popular frameworks of the last decade did.

As time passed, and the results continued to underwhelm, an initial lack of clarity was revealed to be intentional omission. These omissions have been material to both users and developers. Extensive evidence of these failures was provided directly to their marketeers, often by me. At some point (certainly by 2017) the omissions veered into intentional prevarication.

Faced with the dawning realisation that this tech mostly made things worse, not better, the JS-industrial-complex pulled an Exxon.

They could have copped to an honest error, admitted that these technologies require vast infrastructure to operate; that they are unscalable in the hands of all but the most sophisticated teams. They did the opposite, doubling down, breathlessly announcing vapourware year after year to forestall critical thinking about fundamental design flaws. They also worked behind the scenes to marginalise those who pointed out the disturbing results and extraordinary costs.

Credit where it’s due, the complexity merchants have been incredibly effective in one regard: top-shelf marketing discipline.

Over the last ten years, they have worked overtime to make frontend an evidence-free zone. The hucksters knew that discussions about performance tradeoffs would not end with teams investing more in their technology, so boosterism and misdirection were aggressively substituted for evidence and debate. Like a curtain of Halon descending to put out the fire of engineering debate, they blanketed the discourse with toxic positivity. Those who dared speak up were branded “negative” and “haters”, no matter how much data they lugged in tow.

Sandy Foundations

It was, of course, bullshit.

Astonishingly, gobsmackingly effective bullshit, but nonsense nonetheless. There was a point to it, though. Playing for time allowed the bullshitters to punt introspection of the always-wrong assumptions they’d built their entire technical edifice on. In time, these misapprehensions would become cursed articles of faith:

CPUs get faster every year
[ narrator: they do not ]
Organisations can manage these complex stacks
[ narrator: they cannot ]

All of this was falsified by 2016, but nobody wanted to turn on the house lights while the JS party was in full swing. Not the developers being showered with shiny tools and boffo praise for replacing “legacy” HTML and CSS that performed fine. Not the scoundrels peddling foul JavaScript elixirs and potions. Not the managers that craved a check to write and a rewrite to take credit for in lieu of critical thinking about user needs and market research.

Consider the narrative Crazy Ivans that led to this point.

By 2013 the trashfuture was here, just not evenly distributed yet. Undeterred, the complexity merchants spent a decade selling inequality-exacerbating technology as a cure-all tonic.

It’s challenging to summarise a vast discourse over the span of a decade, particularly one as dense with jargon and acronyms as the one that led to today’s status quo of overpriced failure. These are not quotes, but vignettes of distinct epochs in our tortured journey:

“SPAs are a better user experience, and managing state is a big problem on the client side. You’ll need a tool to help structure that complexity when rendering on the client side, and our framework works at scale”
[ illustrative example ]
“Instead of waiting on the JavaScript that will absolutely deliver a superior SPA experience…someday…why not render on the server as well, so that there’s something for the user to look at while they wait for our awesome and totally scalable JavaScript to collect its thoughts?”
[ an intro to “isomorphic javascript”, a.k.a. “Server-Side Rendering”, a.k.a. “SSR” ]
“SPAs are a better experience, but everyone knows you’ll need to do all the work twice because SSR makes that better experience minimally usable. But even with SSR, you might be sending so much JS that things feel bad. So give us credit for a promise of vapourware for delay-loading parts of your JS.”
[ impressive stage management ]
“SPAs are a better experience. SSR is vital because SPAs take a long time to start up, and you aren’t using our vapourware to split your code effectively. As a result, the main thread is often locked up, which could be bad?
Anyway, this is totally your fault and not the predictable result of us failing to advise you about the controls and budgets we found necessary to scale JS in our environment. Regardless, we see that you lock up main threads for seconds when using our slow system, so in a few years we’ll create a parallel scheduler that will break up the work transparently*”

[ 2017’s beautiful overview of a cursed errand and 2018’s breathless re-brand ]
“The scheduler isn’t ready, but thanks for your patience; here’s a new way to spell your component that introduces new timing issues but doesn’t address the fact that our system is incredibly slow, built for browsers you no longer support, and that CPUs are not getting faster”
[ representative pitch ]
“Now that you’re ‘SSR’ing your SPA and have re-spelt all of your components, and given that the scheduler hasn’t fixed things and CPUs haven’t gotten faster, why not skip SPAs and settle for progressive enhancement of sections of a document?”
[ “islands”, “server components”, etc. ]

The Steamed Hams of technology pitches.

Like Chalmers, many teams and managers acquiesce to the contradictions embedded in the stacked rationalisations. Dozens of reasons to look the other way were invented, from the marginal to the imaginary.

But even as the complexity merchants’ well-intentioned victims meekly recite the koans of trickle-down UX (it can work this time, if only we try it hard enough!), the evidence mounts that “modern” web development is, in the main, an expensive failure.

The baroque and insular terminology of the in-group is a clue. Its functional purpose (outside of signaling) is to obscure furious plate spinning. This tech isn’t working for most adopters, but admitting as much would shrink the market for lemons.

You’d be forgiven for thinking the verbiage was designed to obfuscate. Little comfort, then, that folks selling new approaches must now wade through waist-deep jargon excrement to argue for the next increment of complexity.

The most recent turn is as predictable as it is bilious. Today’s most successful complexity merchants have never backed down, never apologised, and never come clean about what they knew about the level of expense involved in keeping SPA-oriented technologies in check. But they expect you’ll follow them down the next dark alley anyway:

And why not? The industry has been down to clown for so long it’s hard to get in the door if you aren’t wearing a red nose.

The substitution of heroic developer narratives for user success happened imperceptibly. Admitting it was a mistake would embarrass the good and the great alike. Once the lemon sellers embedded the data-light idea that improved “Developer Experience” (“DX”) leads to better user outcomes, improving “DX” became an end unto itself. Many who knew better felt forced to play along.

The long lead time for falsifying trickle-down UX was a feature, not a bug; they don’t need you to succeed, only to keep buying.

As marketing goes, the “DX” bait-and-switch is brilliant, but the tech isn’t delivering for anyone but developers. The goal of the complexity merchants is to put your brand on their marketing page and showcase microsite and to make acqui-hiring your failing startup easier.

Denouement

After more than a decade of JS hot air, the framework-centric pitch is still phrased in speculative terms because there’s no there there. The complexity merchants can’t cop to the fact that management competence and lower complexity — not baroque technology — are determinative of end-user success.

By turns, the simmering embarrassment of a widespread failure of technology-first approaches has created new pressures that have forced the JS colporteurs into a simulated annealing process. In each iteration, they must accept a smaller and smaller rhetorical lane as their sales grow, but the user outcomes fail to improve.

The excuses are running out.

At long last, the journey has culminated with the rollout of Core Web Vitals. It finally provides an effortless, objective quality measurement prospective customers can use to assess frontend architectures. It’s no coincidence the final turn away from the SPA justification has happened just as buyers can see a linkage between the stacks they’ve bought and the monetary outcomes they already value, namely SEO. The objective buyer, circa 2023, will understand heavy JS stacks as a regrettable legacy, one that teams who have hollowed out their HTML and CSS skill bases will pay dearly for in years to come.

No doubt, many folks who now know their web stacks are slow and outdated will do as Akerlof predicts, and work to obfuscate that reality for managers and customers for as long as possible. The market for lemons is, indeed, mostly a resale market, and the excesses of our lost decade will not be flushed from the ecosystem quickly. Beware tools pitching “100 on Lighthouse” without checking the real-world Core Web Vitals results.
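
To make that last warning concrete: a Lighthouse score comes from a single lab run, while Core Web Vitals are assessed on field data from real users, at the 75th percentile of page loads. Below is a rough sketch of that assessment, assuming the thresholds Google has published as I understand them (good/poor cut-offs of 2.5 s and 4 s for LCP, 200 ms and 500 ms for INP, which replaced FID as the responsiveness metric in 2024, and 0.1 and 0.25 for CLS). The function names and the sample numbers are invented for illustration; real data would come from your own RUM tooling or a public field dataset.

```python
import math
from typing import Dict, List, Tuple

# (good, poor) cut-offs; LCP and INP in milliseconds, CLS is unitless.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "LCP": (2500, 4000),
    "INP": (200, 500),
    "CLS": (0.1, 0.25),
}


def percentile_75(samples: List[float]) -> float:
    """75th percentile using the simple nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def rate(metric: str, samples: List[float]) -> str:
    """Classify a metric from field samples at the 75th percentile."""
    good, poor = THRESHOLDS[metric]
    p75 = percentile_75(samples)
    if p75 <= good:
        return "good"
    if p75 <= poor:
        return "needs improvement"
    return "poor"


# Made-up LCP samples (ms) from eight real-user page loads.
field_lcp = [1800, 2100, 2400, 3900, 5200, 2200, 2600, 4100]
print("LCP p75:", percentile_75(field_lcp), "->", rate("LCP", field_lcp))
```

Note that the median of these samples would pass comfortably; it is the slow tail of real users that drags the 75th percentile out of the "good" band, which is exactly what a one-off lab run on a fast machine hides.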

Shrinkage

A subtle aspect of Akerlof’s theory is that markets in which lemons dominate eventually shrink. I’ve warned for years that the mobile web is under threat from within, and the depressing data I’ve cited about users moving to apps and away from terrible web experiences is in complete alignment with the theory.

More prosaically, when websites feel like worse experiences to those who greenlight digital services, why should anyone expect them to spend a lot to build a website? And when websites stop being where most of the information and services are, who will hire web developers?

The lost decade we’ve suffered at the hands of lemon purveyors isn’t just a local product travesty; it’s also an ecosystem-level risk. Forget AI putting web developers out of jobs; JS-heavy web stacks have been shrinking the future market for your services for years.

As Stiglitz memorably quipped:

Adam Smith’s invisible hand — the idea that free markets lead to efficiency as if guided by unseen forces — is invisible, at least in part, because it is not there.

But dreams die hard.

I’m already hearing laments from folks who have been responsible citizens of framework-landia lo these many years. Oppressed as they were by the lemon vendors, they worry about babies being thrown out with the bathwater, and I empathise. But for the sake of users, and for the new opportunities for the web that will open up when experiences finally improve, I say “chuck those tubs”. Chuck ’em hard, and post the photos of the unrepentant bastards that sold this nonsense behind the cash register.

We lost a decade to smooth talkers and hollow marketeering; folks who failed the most basic test of intellectual honesty: signposting known unknowns. Instead of engaging honestly with the emerging evidence, they sold lemons and shrunk the market for better solutions. Furiously playing catch-up to stay one step ahead of market rejection, frontend’s anguished, belated return to quality has been hindered at every step by those who would stand to lose if their false premises and hollow promises were to be fully re-evaluated.

Toxic mimicry and recalcitrant ignorance must not be rewarded.

Vendors’ random walk through frontend choices may eventually lead them to be right twice a day, but that’s not a reason to keep following their lead. No, we need to move our attention back to the folks that have been right all along. The people who never gave up on semantic markup, CSS, and progressive enhancement for most sites. The people who, when slinging JS, have treated it as special occasion food. The tools and communities whose culture puts the user ahead of the developer and holds evidence of doing better for users in the highest regard.

It’s not healing, and it won’t be enough to nurse the web back to health, but tossing the Vercels and the Facebooks out of polite conversation is, at least, a start.

from Sidebar https://infrequently.org/2023/02/the-market-for-lemons/

Articles

A Field Guide to AI in the Metaverse

By 2030, each of these technologies (AI, XR, and Blockchain) will be fully integrated into the Metaverse, and each will create massive value for businesses and consumers alike. Learning about and leveraging these new tools will allow the Metaverse to be created not just by programmers, developers, and 3D artists, but by everyone. (Keep reading to make your own!)

“With AI in the Metaverse, everyone will be a creator.”

This article will cover Artificial Intelligence exclusively and its importance to the future of the Metaverse.

Generative AI (Text, Audio & Image)
NeRF — 3D spatial capture
Computer Vision & SLAM
Natural Language Processing & Conversational AI
Automatic Content Creation (3D)

AI in the Metaverse holds the power to unleash unlimited creativity while ensuring everyone has equal opportunities. Many will see these technologies as a replacement for human labor, and for some roles, this will certainly be true, but more likely we will adapt to doing much more with much less, which will be required as we enter the exponential age of humanity. With Generative AI, the biggest thing to note is that while the neural networks used to create novel content are trained on open data sets scraped from the internet, the work they create is not derivative but original. Every piece of content they generate, whether audio, text, video, or images, is a novel creation based on billions of training data points.

“We are entering the exponential age of humanity”

Before you continue this article I want you to understand two things:

AI Changes Everything.
AI is Already Here and it’s not going away.

‘AI has Huge implications in the Metaverse’ — TIME

The Metaverse consists of the collection of media including video, audio, and text that we see in the current iteration of the internet plus three groups of technologies: AI, XR, and Blockchain. If for no other reason than that Ryan Reynolds is already using AI and incredible art like the video above is being made, you should be paying attention.

Generative AI (Text & Image)

Let’s start with the most common and best understood: generative AI interfaces based on GPT (Generative Pre-Trained Transformer) models, the most well-known being ChatGPT. These generative AI models are trained on massive datasets, much of them scraped from the internet. Based on simple text input, these AI platforms can create incredibly valuable responses that can be used for:

Search: AI-powered insights. Google AI, ChatGPT, OpenAI
Text: Summarizing or automating content. GPT3/4, ChatGPT, OpenAI
Images: Generating images. Midjourney, DALL-E, Stable Diffusion
Audio: Summarizing, generating or converting text into audio. Play.ht, Clipchamp, Soundraw
Video: Generating or editing videos. Synthesia, VEED.io
Code: Generating code. ChatGPT, GitHub Copilot, IntelliCode, PyCharm, Jedi
Chatbots: Automating customer service and more. Zendesk, Ada, DeepConverse
Natural Language Processing (NLP): InWorldAI, Synthesia, MindMeld
Computer Vision: HawkEye, VisoAI, DeepMind, SenseTime
Simultaneous Localization & Mapping (SLAM): Apple, PTC, Snap, Niantic, Meta
Machine Learning (ML): NVIDIA, Microsoft, iTechArt, Meta
Suggestion Algorithms: Google, Amazon, Microsoft, Netflix

Let’s do a text one together

STEP 1: Go to chat.openai.com/chat and wait for a free server

STEP 2: Enter Prompt — ‘Write a fun dad joke about AI.’

OUTPUT: Why was the AI feeling cold? Because it left its algorithm open!

STEP 3: Laugh, either at how dumb the joke is or at how instant the response was. Either way, it truly is amazing.

STEP 4: Try a bunch of work-related tasks you need done asap. (ie. Write an article about…Give 10 examples of….Write a marketing strategy for…)

Let’s make an image using Midjourney

STEP 1: Register at Midjourney

STEP 2: Get on the Midjourney Discord Server

STEP 3: Find a room to submit your query.

STEP 4: Prompt the following: /Imagine Dragon hanging on a castle high resolution photoreal, fire breathing --ar 3:2

NOTE: Imagine is required to start the prompt (not part of the band)

STEP 5: Choose the one you like and click V# if you want to see 4 more versions

STEP 6: Choose the best one and click U# to upscale the image

OUTPUT:

Here is a little more reading for a deeper understanding of Who Owns the Generative AI Platform? from a16z and McKinsey’s “What is Generative AI?” You can also read this PBS Special on How AI Turns Texts into Images.

Without getting too philosophical on this subject, generative AI holds the potential to fundamentally change the fabric of society. Imagine when AI not only defends you in court, but also drafts (and passes) laws. AI is already being used by governments to decide who gets welfare and who doesn’t…and many times, it gets it wrong! Imagine your grandmother being denied medical coverage because an algorithm decided she was not worth saving. What other business models and social constructs will be upended? If everyone uses AI to create content, there will be unintended consequences, but the value this nascent technology will create cannot be overstated.

NeRFs (Neural Radiance Fields)

Let’s move to another subset of AI known as NeRFs, not related to the foam missiles you fire at your younger brother, but Neural Radiance Fields, a complex field of study that uses computer vision from a regular RGB camera to capture video and translate it into volumetric 3D renders you can import into 3D platforms and view spatially. NeRFs are not just a better way to turn scans of real-world places into 3D, orders of magnitude faster than current LiDAR solutions and at a fraction of the cost (a smartphone camera vs. a $50–100K scanner). The AI also takes the information and fills in the blanks to create a realistic and believable virtual version of physical space. These virtual models of the real world will help us populate spaces in the Metaverse quickly and easily, making everyone a creator.

These digital replicas of the real world will help us build shared spaces in the metaverse quickly and easily, extending real-life social networks and accelerating mainstream adoption.

For those like me who don’t understand the above diagram, there is a more simplified explanation here: “NeRF, or better known as Neural Radiance Fields, is a state-of-the-art method that generates novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. The input can be provided as a Blender model or a static set of images.” Basically, wave your phone around and voila, you have a 3D volumetric capture (or at least that is the promise).
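
If you want a feel for what that “volumetric scene function” actually does, here is a tiny Python sketch of the rendering step as I understand it from the NeRF literature: for every pixel, you march along a camera ray, look up a density and a color at each sample point, and blend them front to back. In a real NeRF a neural network predicts those densities and colors, and training adjusts the network until the blended pixels match your input photos. In this sketch the values are simply hand-written, and composite_ray is a hypothetical helper name.

```python
import math
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def composite_ray(
    densities: Sequence[float],
    colors: Sequence[Color],
    deltas: Sequence[float],
) -> List[float]:
    """Blend samples along one ray, front to back, into a single RGB pixel."""
    transmittance = 1.0          # fraction of light not yet blocked
    pixel = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        # opacity of this segment: denser or longer segments block more light
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for k in range(3):
            pixel[k] += weight * color[k]
        transmittance *= 1.0 - alpha
    return pixel


# Hand-written example: empty space (density 0) in front of a solid red surface.
pixel = composite_ray(
    densities=[0.0, 1e9],
    colors=[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)],
    deltas=[1.0, 1.0],
)
print(pixel)
```

The empty sample contributes nothing no matter what color it claims to be, so the ray “sees through” it to the red surface behind. That is how a NeRF can represent free space and solid objects with the same function.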

NVIDIA getting started with NeRFs guide (for advanced programmers)

Note: There are no really easy ways to do this currently, but if you want to go deep, here is a video that explains how to make a NeRF in the easiest way I have found thus far (warning, it’s hard!).

You can also download a program called Polycam 3D for iPhone or Android and start 3D scanning objects and/or scenes for use in platforms such as MetaVRse or Unity.

Computer Vision & SLAM

Computer vision (CV) is the field of computer science that focuses on replicating the complexity of the human visual system and enabling computers to identify and process objects in images and videos in the same way that humans do. Imagine how autonomous cars see and how VR headsets understand what is around you.

Simultaneous Localization and Mapping (SLAM) is a form of computer vision that allows your phone to map and understand your surroundings in order to display 3D content in your space. Built into your mobile device are several sensors (accelerometer, gyroscope, LiDAR scanner) that, in addition to what the RGB cameras see, provide context in terms of position in the X, Y, Z or 6-Degree of Freedom (6-DOF) space. This allows your phone to understand where the floor is and simultaneously project content into augmented reality.

As CV technology continues to advance, the possibilities will expand from autonomous vehicles, robots, and drones to augmented reality that looks as real as real.

Some of the capabilities of CV and SLAM include object recognition and tracking (think tracking a real-world object while projecting digital information on top of it).

Natural Language Processing & Conversational AI

Natural Language Processing (NLP) is a field of AI that focuses on the interaction between computers and human language. It involves the use of algorithms and statistical models to analyze, understand, and generate human language. NLP is used in a wide range of applications such as language translation, text-to-speech, sentiment analysis, and more.

Conversational AI is a subfield of NLP that focuses on creating human-like interactions between computers and humans using natural language. This can include chatbots, virtual assistants and voice assistants. The goal of conversational AI is to create a seamless and natural communication experience for users. This can be achieved through the use of advanced NLP techniques such as natural language understanding and generation, as well as machine learning and deep learning.

Automatic Content Creation

Nothing says AI like automation. These tools allow you to say what it is you want to create and voila, it is there, in 3D! While there will be a ton of these tools in the Metaverse, this is the first one that we know of that works. This slideshow will give you a much deeper understanding of how this technology will revolutionize gaming. Even music is being created by AI now. Give it a try yourself at Anything World.

Check out this cool 3D object created completely by AI on LumaLabs.

To learn more about new cutting-edge technologies like GET3D from NVIDIA, Make-a-Video from Meta, and DreamFusion from Google, follow Two Minute Papers on YouTube.

As you can see, this is the future and while it is not quite ready for prime time, researchers are using AI to solve for AI so it won’t be long before this becomes how we build every virtual world in the Metaverse.

Generative AI Startup Landscape:

Well, there you have it, a pretty comprehensive look at the artificial intelligence algorithms that will directly impact and hopefully benefit you in the Metaverse.

Alan Smithson is co-founder of MetaVRse.

Header image credit: Midjourney

How to ask questions like a UX Researcher

Tips for asking good, engaging, and productive questions

Two people sitting at the table, one person asking question to the other — Stock Image from Pexels Contributor Alex Green

As researchers we love a good question, but what’s the craft behind asking questions? In short: beware leading questions, ask one question at a time, and manage the flow. And as a last resort, ask less and listen more.

I mean, everyone asks questions every day, right? So who doesn’t know how to ask questions? Do you ask clear and insightful questions? Perhaps most importantly, do you want to learn how to ask good questions?

Well, actually, these are all horrible questions. They are leading. They come off as a bombardment. And they create zero space for nuanced discussions.

Asking questions is a craft that is at the heart of what UX Researchers do. It is complex, and comes with a rich and eclectic literature inspired by sociology, psychology, and other social sciences. But asking questions can also be very easy. Just don’t be the dummy that wrote the first paragraph and instead avoid the 3 common traps of asking questions.

Tip no. 1: Avoid leading questions at all cost

An attorney standing up in court, declaring “objection, leading the witness” — The lawyers know it!

A leading question suggests the answer to be given, and makes you feel pressured to answer a certain way.

Much like in a courtroom, we want the truth, the whole truth, and nothing but the truth from user interviews. However, the challenge is that, as international superstar Lizzo sagely observed, “Truth Hurts.” Honest feedback can sting. And with a natural inclination toward psychological safety, any reasonable person under pressure would say whatever they believed the listener wanted to hear, even at the expense of truth. So without a mandate or an oath, how might we get honest reactions and answers?

The key is to avoid leading questions. Let go. Play dumb. Be curious. Judge none. Any of these strategies helps create a safe environment where the participant might feel secure enough to disclose their deepest and darkest secrets. And sometimes, when an interview goes really well, participants share their sharpest thinking, which can shatter any preconceived assumptions.

Imagine how you would respond to the opening question: “Everyone asks questions every day, right?” You’re sitting in front of a computer, talking to the interviewer for the first time via Zoom. No clue about the interviewer’s character other than their interesting fashion choice. With such a leading question, would you challenge the interviewer and confess the truth? Something along the lines of: “No actually, your premise for the question is miserably wrong. Yesterday, I didn’t speak to a living soul the entire day. And let me tell you, it was a delight. So no, I don’t ask questions every day; nor do I want to.”

That’s why it’s important to avoid leading questions: to allow participants to speak their mind. According to sociologist Robert Weiss, whose book Learning from Strangers sits atop the syllabus of virtually every PhD-level Qualitative Methods Seminar, “the interview relationship is a research partnership between the interviewer and the respondent.” And like any well-functioning relationship, there shall be no strong-arming of opinions.

Interested to read more? Check these out:

Survey Monkey: How to avoid asking leading questions and loaded questions
Nielsen Norman Group: Avoid Leading Questions to Get Better Insights from Participants

Tip no. 2: Avoid double-, triple-, quadruple- barreled questions

Interviewer saying “I don’t remember the question” while shaking her head — What was the other part of your question again?

A double-barreled (compound) question comes with multiple parts, and expects you to give detailed responses while remembering the many parts of the question.

And it’s just too hard. You’re staring at the camera above the screen, trying to give the interviewer your undivided attention, and in exchange, you receive a barrage of questions too many to count. “Ugh, I guess. I mean, it’s probably fair to say everyone asks questions, right, though I didn’t say a single word yesterday. But regardless, by extension of the premise, everyone should have a little experience with asking questions. And… and.. what was your question again?”

The problem with asking compound questions is that the pace of query far exceeds the regular processing time of the human mind. Unless you have mastered the craft of human parallel computing, answering many questions at a time is outright overwhelming. You’re faced with a tradeoff, between developing a full answer to the first question and zipping through every question to check off the list. Neither is optimal. Especially when you respond to a multi-part question with a multi-part answer, it leaves no room for the researcher to process, unpack, and dig deeper.

In the worst case, double-barreled questions can lead to inaccurate results. Take the third question in the opening paragraph as an example: “Do you ask clear and insightful questions?” How would you respond? What if you’re good at making yourself understood but struggle to navigate the domain context? When I first started at Stavvy, I felt confident about asking questions clearly, but I knew little about the real estate industry. So to the double-barreled question, I would probably say, “Um.. yes?” But that couldn’t be further from the truth.

By splitting up many queries into singular questions, we ensure that each question is addressed with the adequate time it deserves, and as a result get accurate answers from interviews.

Interested to read more? Check these out:

Qualtrics: The dreaded double-barreled question & how to avoid it in research
Dresden University of Technology: (PDF) Double Barreled Questions: An Analysis of the Similarity of Elements and Effects on Measurement Quality

Tip no. 3: Avoid rigidly going down a list of questions

Housewife Lisa looking up saying “This is so awkward” — When the interviewer reads off of a list of questions

Let’s talk about structure and flow. Ideally, an interview should be a conversation, not an interrogation. As research partners, the interviewer and the participant should move from one topic to another as they both see fit.

There is a spectrum when it comes to interview structure. On one end, there is “structured interview,” where each participant is asked exactly the same questions in the same order with no spontaneous follow-ups. On the other end, there is “unstructured interview,” where everything is free-form, and the interviewer might ask drastically different questions from one participant to another.

While there is a specific use case for structured interviews, the rigid format has many limits. Most prominently, it relinquishes the power of conversation and instead acts as a verbal survey. Not only does the format produce awkward transitions between questions, it also fails to explore the exciting ideas brought up by the participant. This type of interview is a one-way street, not a two-way collaboration.

Similarly, unstructured interviews have their faults, too. While the conversation might flow, the data collected from such sessions will likely vary in its topic coverage, making the analysis challenging, if not impossible.

So what UX Researchers tend to prefer is a format somewhere in between: semi-structured interviews. Have a list of topics to explore, but follow the lead of the participant. Make questions increasingly specific as you dig deeper. I think of the process as exploring a cave with crystals.

An illustration of a crystal cave, with clusters of yellow crystals labeled in groups 1 through 7. Caption says: Explore, Guide, Dig. Crystal cave of a mental landscape, illustrated by yours truly.

Much like following the crystals in a cave, conducting semi-structured interviews is about following the mental landscape of the participant. There might be clusters of crystals (thoughts) that exist close to each other. So explore 1, 2, and 3 together before pivoting to 4. When in 4, allow the sight of one crystal to take you to its neighbor, so that you appreciate the cluster of thoughts in its full magnificence. No point in bringing up 7 out of the blue.

The shape of the crystal also matters. The tippy points of the crystals can be thought of as the points of investigation, specific questions like “are participants able to find the action menu tucked away behind the icons?” We want to meet the participants there. But never arrive by asking leading questions. Instead, with each following question, get a little bit more specific, much like the shape of the crystal getting more pointed as it gets closer to the ground. This way, we avoid projecting ideas onto participants while also eventually getting to the points of investigation.

Bonus tip: Sometimes it’s better to not ask questions

Kyle speaking with emphatic hand gestures: “shut up and listen” — Silence is my secret weapon!

If the three traps of asking questions feel a little bit tricky to navigate, there is always the option to just stay present and listen. I’m not being snarky. Less is more when it comes to asking questions.

If we’re here to establish a partnership with research participants, then, much like on a date — man, you gotta listen. Make the participant feel like you are actively listening, and you just might be rewarded with answers to the unknown unknowns. Resist the urge to fill the silence. Allow participants to mull things over. And if you absolutely need to say something, just repeat their last three words in the form of a question.

“Yesterday, I didn’t speak to a living soul the entire day.”

“The entire day?”

“Yeah, I didn’t have any meetings so I didn’t have to talk to anyone. I kind of like that.”

“Kind of like that?”

“Yeah, it’s just so much peace and productivity. I feel like not everyone has to ask questions all the time. It’s probably just a researcher thing.”

“A researcher thing?”

“Yeah, like, I mean I probably don’t want to spend my day thinking about how to ask questions. I’m happy to just read about it in a blog post.”

This technique, referred to by master negotiator Chris Voss as Mirroring, is a powerful tool to stay engaged in a conversation while allowing the maximum space for the other side to tell their story. After all, that’s all interviews are: collecting stories from real people and trying to represent them faithfully. That, combined with the ability to avoid leading questions, split up compound questions, and appreciate the crystal clusters in participants’ minds in an orderly way, would enable anyone to ask questions like a UX Researcher.

Want to dig deeper in the craft of UX Research? Here are some of my favorite books:

Steve Portigal: Doorbells, Danger, and Dead Batteries: User Research War Stories
Robert Weiss: Learning From Strangers: The Art and Method of Qualitative Interview Studies
Stephanie Walter: A Cheatsheet for User Interview and Follow Ups Questions
IDEO: The Little Book of Design Research Ethics

How to ask questions like a UX Researcher was originally published in UX Collective on Medium, where people are continuing the conversation by highlighting and responding to this story.

from UX Collective – Medium https://uxdesign.cc/how-to-ask-questions-like-a-ux-researcher-a4e02041136c?source=rss----138adf9c44c---4

Articles

10 digital twin trends for 2023

Interest in digital twins has picked up over the last year. Digital twin tools are growing in capability, performance and ease of use. They are also taking advantage of promising formats like USD and glTF to connect the dots among different tools and processes.

Advances in techniques for combining models can also improve the accuracy and performance of hybrid digital twins. Generative AI techniques used for text and images may also help create 3D shapes and even digital twins. These kinds of advances will allow enterprises to mix and match modeling capabilities in new ways and for new tasks.

Here are 10 trends to watch for in the year ahead.

1. From connecting files to connecting data

Over the last several years, all the major tools for designing products and infrastructure have been moving to the cloud while still using legacy file formats to exchange data. Increasingly, vendors are calling out the data integration aspects of these tools that make it easier to share digital twins across different tools and services.

This capability often starts as a subset of a vendor’s tools. For example, Siemens is rebranding a new subset of its tools as part of Siemens Xcelerator, while Bentley has launched Phase 2 of the infrastructure metaverse. In November, location intelligence leader Trimble launched Trimble One, a “purpose-built connected construction management offering that includes rich field data, estimating, detailing, project management, finance and human capital management solutions.”

It’s one thing simply to move apps to the cloud. These innovators are doing something else: pioneering more efficient ways to connect data across these apps. Over the next year, the other major construction and design tools providers will likely announce similar advances for connecting digital twins and digital threads across different processes.

2. Entertainment firms target the industrial metaverse

Epic and Unreal have made significant progress partnering with digital-twin leaders to provide a better user experience across devices. These companies have announced significant partnerships with GIS, construction and automobile leaders.

Blackshark AI developed the globe behind Microsoft’s latest flight simulator, and went on to scale the tech for automatically transforming raw satellite imagery into labeled digital twins. In April, Maxar, a leading satellite imaging provider, announced a significant investment in Blackshark for Earth-scale digital twins.

Over the next year, more gaming and entertainment companies will find opportunities in the industrial metaverse, which ABI expects to eclipse the consumer metaverse over the next several years.

3. Nvidia galvanizes support for USD

Pixar pioneered the Universal Scene Description (USD) format to improve movie production workflows. Nvidia has championed USD to connect the dots across various digital twins and industrial metaverse use cases. The company has built connectors to the IFC standard for buildings, and is improving workflows for Siemens in industrial automation and Bentley in construction.

USD still lacks support for physics, materials and rigging, but despite its limitations, there is nothing better for organizing the 3D information for giant digital twins. Nvidia’s pioneering work on USD promises to integrate raw data with various industry, medicine and enterprise workflows.

4. glTF simplifies digital-twin exchange

There is growing momentum behind the glTF file format for exchanging 3D models across different tools. The Khronos Group calls it the JPEG for the metaverse and digital twins. Expect glTF to pick up steam, particularly as creators look for an easy way of sharing interactive 3D models across tools.

5. Generative AI meets digital twins

Over the last year, the world has been wowed by how easy it is to use ChatGPT to write text and Stable Diffusion to create images. Meanwhile, others have demonstrated new multimodal tools like DeepMind’s Gato for harmonizing models across text, video, 3D and robotic instructions. Over the next year, we can expect more progress in connecting generative AI techniques with digital twin models for describing not only the shape of things but how they work.

Yashar Behzadi, CEO and founder of Synthesis AI, a synthetic data tools provider, said, “This emerging capability will change the way games are built, visual effects are produced and immersive 3D environments are developed. For commercial usage, democratizing this technology will create opportunities for digital twins and simulations to train complex computer vision systems, such as those found in autonomous vehicles.”

6. Hybrid digital twins

There are a variety of performance, accuracy and use case tradeoffs among the models used in digital twins. Prith Banerjee, CTO of Ansys, believes that in 2023 enterprises will find new ways to combine different approaches to hybrid digital twins.

Hybrid digital twins make it easier for CIOs to understand the future of a given asset or system. They will enable companies to merge asset data collected by IoT sensors with physics data to optimize system design, predictive maintenance and industrial asset management. Banerjee foresees more and more industries adopting this approach with disruptive business results in the coming years.

For example, a healthcare company can develop an electrophysiology simulation of a heartbeat as the muscles contract, the valves open and the blood flows between the heart’s chambers. The company can then take a patient’s MRI scan and develop a simulation of that specific individual’s heart and how it would react to the insertion of a particular pacemaker model. If this R&D work is successful, it could help medical device and equipment companies invent new products and apply for FDA trials by demonstrating in-silico trials.

7. FDA modernization act replaces animals with silicon

Animal testing has been a requirement for all new drugs and treatments since the FDA’s early days. This year, the U.S. Congress passed the FDA Modernization Act 2.0, allowing pharmaceutical companies to replace animal testing with in-vitro and in-silico methods. This will drive innovation and commercialization of patients-on-a-chip and better medical digital twins for testing more cost-effectively and humanely.

Tamara Drake, director of research and regulatory policy at the Center for Responsible Science, told VentureBeat, “We believe in-silico methods, including use of artificial intelligence in conjunction with advance organs on a chip, or patient-on-a-chip, will be the biggest trend in drug development in coming years.”

8. Digital twin ecosystems open new use cases

Matt Barrington, emerging technology leader at EY Americas, predicts that digital twins will increasingly transform how we run companies in 2023. For example, using a digital market twin to evaluate new products will support management and strategic decision-making. Digital twins will also underpin supply chain resilience in uncertain times, and improve risk management, safety and sustainability.

This transformation will require increased emphasis on foundational digital capabilities in data management and devops for data engineering, as well as a more comprehensive approach to security. Barrington predicts fragmentation and a high degree of specialization in the market, such that no single vendor has an end-to-end digital twin solution. Companies will have to integrate several capabilities to create the right fit-for-purpose solution for their business. Part of that approach will require more composable, open architectures and the ability to curate an ecosystem-based system.

9. Enterprise digital twins take off

Vendors have made significant advances in tools for process mining and process capture to create a digital twin of the organization.

Bernd Gross, CTO at Software AG, said these advances allow enterprises to create simulations for an entire department or a cluster of business processes rather than a single business process.

Leaders will find ways to incorporate various technologies, such as process mining, risk analysis and compliance monitoring, to drive more accurate outcomes. These techniques require greater breadth and depth of data. Today, enterprises must include relevant KPIs, causalities between processes, the life cycle of a business unit and more to create a genuinely accurate enterprise digital twin.

10. Digital twins drive 5G

5G delivers significantly faster speeds in direct view of one of the newer towers, but can be slower than 4G in the radio shadow zone. Cellular service providers are engaged in a race to fill in these shadows, and digital twins could help. Fortune Business Insights estimates that the market for 5G cells could grow by 54.4% annually through 2028.

Mike Flaxman, spatial data science lead at Heavy AI, said many telcos are looking at digital twins to shift to a plan, build, and operate model that allows them to maximize service while cutting costs.

from VentureBeat https://venturebeat.com/programming-development/10-digital-twin-trends-for-2023/

Articles

Expanding the Reach of Design Tokens: How to Use Them in Non-UI Design

Graphical Version of Color Palette

Design Tokens: The Secret to Consistency Beyond the User Interface

An organization can use design tokens to ensure consistency and coherence across all of its design decisions, not just those related to the user interface (UI). As we’ll see in this post, design tokens can be used to make a wide variety of design elements, including PowerPoint presentations, flyers, ads, and even a company’s printer and other printed materials, more consistent and high quality. But let’s discuss some basics for the people who are new to the concept of design tokens.

What are design tokens?

A design token is a variable that represents a core design element in a system, such as color, typography, spacing, and other interactive and visual properties. Designers and developers can easily access and use these elements as tokens throughout the design process, ensuring that each design decision is consistent with the overall design scheme.
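
To make the idea of a token as a variable concrete, here is a minimal Python sketch. The token names, the values, and the curly-brace alias notation are illustrative assumptions, not something this post prescribes. The sketch shows a flat token set in which one token can point at another, so changing the brand color in one place updates everything that refers to it.

```python
# Hypothetical token set; every name and value here is illustrative only.
TOKENS = {
    "color.brand.primary": "#0055FF",
    "color.text.default": "#1A1A1A",
    # An alias: this token takes its value from another token (assumed notation).
    "color.button.background": "{color.brand.primary}",
    "font.family.body": "Inter",
    "space.medium": "16px",
}


def resolve(name, tokens=TOKENS, _seen=None):
    """Return the concrete value of a token, following aliases to their source."""
    seen = set() if _seen is None else _seen
    if name in seen:
        raise ValueError(f"circular alias at {name!r}")
    seen.add(name)
    value = tokens[name]
    if value.startswith("{") and value.endswith("}"):
        return resolve(value[1:-1], tokens, seen)
    return value


print(resolve("color.button.background"))  # prints #0055FF
```

Because the button background is defined as an alias, a rebrand only needs to change `color.brand.primary`; anything that resolves through it picks up the new value.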

Why use design tokens?

Using design tokens has several benefits. First and foremost, it ensures consistency in the design of a product or brand. All elements of the product or brand can have a cohesive and harmonious look and feel by using the same set of design tokens throughout the design process.
As well as improving consistency, design tokens can increase the efficiency of the design process. Defining design elements as tokens allows designers to easily access and use them, eliminating the need to create common elements from scratch. For larger design projects, this can save a great deal of time and effort.
Finally, design tokens can enhance the maintainability of a design system. By using tokens to represent core design elements, designers can easily update the system to reflect changes in branding or design direction. This can be particularly useful for companies that need to stay on top of changing trends or market needs.

Tokens beyond UI design

Design tokens are frequently used in UI design, but they can also be used in PowerPoint presentations, flyers, ads, PDFs, and even physical materials like company printers.

In a PowerPoint presentation, for example, design tokens can be used to define colors, typography, and other visual elements. Regardless of who is creating the company’s presentations, the company can maintain a consistent and professional appearance. Design tokens can likewise be used to create a cohesive and unified brand image on flyers, ads, and other promotional materials.

Even physical materials like company printers, business cards, and other branded items can be designed consistently with design tokens. By defining the colors, typography, and other design elements as tokens, a company can ensure that all of its physical materials have a consistent and professional appearance, regardless of where or when they are produced.

For a company to use design tokens in their PowerPoint presentations, here are some steps that they can follow:

1. Define your design tokens in advance. Start by choosing the core design elements you want to capture as tokens, such as colors, typography, spacing, and other visual elements.
2. Create a library to store the tokens. This can be a simple spreadsheet or a dedicated design platform such as Figma, but I’d recommend making your design tokens available somewhere that is also accessible to non-designers.
3. Apply the tokens when you build your company’s PowerPoint templates. For example, if you have defined a particular color as a design token, use that color consistently throughout the template.
4. Keep the token library up to date. If the design direction changes while you work on the templates, update the library so that every design decision stays consistent with the overall design system.
5. Encourage employees to use the company’s PowerPoint templates whenever they create presentations. As a result, all presentations will follow the design system defined by the design tokens.

The above steps will assist a company in using design tokens to define the colors, typography, and other visual elements of their PowerPoint presentations, helping to create a cohesive and unified brand image for the company.
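
As a rough illustration of the library and template steps above, the Python sketch below exports a token library as CSV (so it opens in any spreadsheet tool) and flags colors that match no color token. The token names, hex values, and the hand-collected list of template colors are all hypothetical; reading colors out of a real PowerPoint file is outside the scope of this sketch.

```python
import csv
import io

# Hypothetical token library; names and values are illustrative only.
TOKEN_LIBRARY = {
    "color.brand.primary": "#0055FF",
    "color.brand.accent": "#FFB000",
    "color.text.default": "#1A1A1A",
    "font.family.heading": "Georgia",
}


def to_csv(tokens):
    """Serialize the library as CSV so non-designers can open it as a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["token", "value"])
    for name in sorted(tokens):
        writer.writerow([name, tokens[name]])
    return buffer.getvalue()


def off_palette(colors_used, tokens):
    """Return the colors in a template that do not match any color token."""
    allowed = {v.upper() for k, v in tokens.items() if k.startswith("color.")}
    return sorted({c.upper() for c in colors_used} - allowed)


# Colors collected by hand from a hypothetical slide template.
template_colors = ["#0055ff", "#1A1A1A", "#FF0000"]

print(to_csv(TOKEN_LIBRARY))
print(off_palette(template_colors, TOKEN_LIBRARY))
```

A check like `off_palette` is one possible way to notice when a template has drifted from the library; whether to fix the template or add a new token is still a design decision.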

Other creative ways to use tokens

Design templates: It is possible to create templates for different types of design work, such as social media posts, email newsletters, and presentation slides, with the help of design tokens, which can help you ensure that all your templates are consistent and professional.
Design marketing materials: Design tokens can also be used to design marketing materials such as flyers, brochures, and ads. This is because they can help create a cohesive and unified brand image across all of your marketing efforts and help to build brand recognition.
Design physical materials: It is also possible to use design tokens to develop physical materials, such as business cards, packaging, branded merchandise, and stationery, so that their appearance is consistent and professional. By using design tokens, you will ensure that all your physical materials are well-designed and look consistent.

Tokens are a powerful tool for ensuring consistency and coherence in user interface design. As long as designers define the core design elements as tokens, and use them consistently during the entire design process, they can create a UI that reflects the unique identity of their brand cohesively and harmoniously. The process of integrating design tokens into user interfaces involves a bit of planning and organization on the part of product designers, but the benefits of improving consistency, efficiency, and maintainability outweigh the effort.

That’s the end of this short yet hopefully insightful read. Thanks for making it to the end. I hope you gained something from it.

👨🏻‍💻 Join my content verse or slide into my DMs on LinkedIn, Twitter, Figma, Dribbble, and Substack. 💭 Comment your thoughts and feedback, or start a conversation!

from Design Systems on Medium https://uxplanet.org/expanding-the-reach-of-design-tokens-how-to-use-them-in-non-ui-design-60aa4a8e87c
