What AI should beginning authors use? Over the last few years, the answer has been to use ChatGPT for everything. But as I record this, ChatGPT is starting to fall behind. Almost everything ChatGPT can do, one of its competitors can now do better. It is not that ChatGPT is bad, at least not yet, but it is starting to feel like AOL.
Back in 1998, AOL was a fine way to get on the internet, but other ISPs were emerging that were better. ICQ was better than AOL Instant Messenger, and cable was faster than dial-up. If you do not get those references, ask your parents what an AOL CD or a floppy disk is and watch them get nostalgic.
Some authors only use ChatGPT and do not bother with other models. If that’s you, you might be wondering why you should even listen to this episode. There is a lot to be said for using the best tool available, and if you are not careful, you could end up being the only person still using AOL.
Which AI tool is best?
Thomas: There are over a thousand large language models that are not made by OpenAI. So how do you know which tool is best for the job, and which one is best for you?
I asked The Nerdy Novelist, Jason Hamilton, who teaches authors how to make AI work without sacrificing ethics or creativity. He is a fantasy author and the host of the Nerdy Novelist YouTube channel.
What is the difference between an AI model and an AI tool?
Jason: An AI model and an AI tool are different. Models are one thing, but the tool I use the most right now is Perplexity. I like it as a chatbot. It does everything other chatbots do and more.
Perplexity started as a search replacement. Instead of using Google, you could use AI, and Perplexity did that very well. Now it does much more. One reason I like it is that it includes multiple large language models. With ChatGPT, you are limited to OpenAI’s models. With Claude, you only use Claude models. With Perplexity, you get access to several major models, including DeepSeek, as well as the major models behind Claude, ChatGPT, and Gemini. Perplexity also has proprietary models tailored to search.
How do the latest LLMs and AI writing tools compare to each other?

Thomas: I have been impressed with Perplexity every time I use it. I think of it as a research assistant. You can give it a project like, “Go through these hundred piles of hay and find all the red needles.” It might say it will take twenty minutes, which is fine because it would take me two days.
It is excellent for original research. You are not relying on an anonymous blog post for research. It actually looks up laws, historical facts, and primary sources.
Jason: It is not the only platform doing search now, but I think it does it best. I use it as a replacement for Google.
It is also an effective technical helper. If something is going wrong with the software, I describe the problem, and it searches Reddit posts and official documentation. Nine times out of ten, it gives me the exact solution.
Thomas: My daily driver is Grok for similar reasons. Using the Grok app feels a lot like Perplexity. It searches hundreds of webpages quickly and also scans X posts.
If you need insight on a current event that happened in the last thirty minutes, Grok can answer it because it reads real-time posts. It is not quite as strong for deep research as Perplexity, but it is fast and effective.
I have noticed that ChatGPT is increasingly hesitant to pull webpages, even when you provide the exact link to documentation. It often asks you to paste the content instead of reading it itself. Neither Grok nor Perplexity has ever asked me to paste a webpage, but ChatGPT increasingly asks users to do that extra work.
Jason: It is funny you used the AOL metaphor. I made that same connection recently while watching a retrospective on the 1990s. AOL had a strong head start, but then Google and Microsoft arrived and drowned it out. I see something similar happening with ChatGPT.
Thomas: Right down to merging with major media companies. AOL merged with Time Warner. OpenAI just signed a billion-dollar deal with Disney. They are following a similar path.
What does ChatGPT still do better than competitors?
Thomas: One feature I still like is ChatGPT Projects. It saves context across multiple threads, so they influence each other. When I build Patron Toolbox tools, I use a dedicated project that remembers decisions across tools.
Gemini does not have anything like this. Grok’s project feature is not as robust. It shares instructions but does not share context as deeply. ChatGPT is still the best in this specific area.
Jason: Perplexity and Claude both have similar features. Claude introduced projects first, and ChatGPT copied the idea. They do a good job, but I personally prefer to turn the memory off. I do not want my prompts influenced by information I do not control precisely. That fits my use cases, though I can see how memory is valuable for productivity and business workflows.
How do you use AI for productivity and automation?
Thomas: You use AI more for brainstorming and writing. Do you also use it for productivity tasks like inbox management or marketing?
Jason: Less so right now, but I am moving in that direction. My favorite area is automations using tools like Make or n8n. These platforms let you build custom software workflows easily.
I built an automation that takes a book, summarizes each chapter, and then creates marketing materials. That includes book descriptions, ad headlines, seed keywords for tools like Publisher Rocket, and email launch sequences.
I keep adding to it, and it has been incredibly helpful. There are countless applications like this using automation platforms.
Is AI more like an operating system or a chatbot?
Thomas: I had a major shift in perspective on AI recently. AI is not really a chatbot. It is more like an operating system where you build software using natural language.
You tell it, “Summarize the chapters, then create an email sequence, then write a book description.” It is not perfect yet, but it is dramatically easier than it was five years ago.
The chat interface is often the worst way to interact with AI. It encourages people to anthropomorphize the system, to seek validation, or to ask, “Am I a good writer?” That is not what it is for.
AI is a tool to help you write more books and market better. It is not your friend. It does not have a soul.
Jason: An LLM is not the same as a chatbot. A chatbot is software built around the model. The model is like electricity, and the chatbot is the appliance.
Chatbots became popular because the back-and-forth interface is simple and versatile. You can use them for therapy-style conversations, writing, marketing, or multimodal tasks like image and video creation.
But the chat format is a limited way to interact with AI. There are more direct and efficient ways to interact with models.
For example, I can give an AI my book in Google Docs. If the chapters are properly formatted, it can automatically run prompts on every chapter to summarize, proofread, or analyze issues. I just upload the file, click a button, and it does the work without further interaction.
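A workflow like this can be sketched in a few lines. The code below is a minimal illustration, not Jason's actual automation: it splits a plain-text manuscript on "Chapter" headings and builds one summarization prompt per chapter, which you would then send to whatever model or platform you use.

```python
import re

def split_chapters(manuscript: str) -> list[str]:
    """Split a plain-text manuscript on lines that start with 'Chapter'."""
    parts = re.split(r"(?m)^(?=Chapter\b)", manuscript)
    return [p.strip() for p in parts if p.strip()]

def build_prompts(manuscript: str) -> list[str]:
    """Create one summarization prompt per chapter."""
    return [
        f"Summarize the following chapter in three sentences:\n\n{chapter}"
        for chapter in split_chapters(manuscript)
    ]

book = "Chapter 1\nIt was a dark night.\nChapter 2\nDawn broke at last."
prompts = build_prompts(book)
# Each prompt would then be sent to the model of your choice.
```

The key idea is the "properly formatted" part Jason mentions: the automation only works if the chapters are marked consistently enough for a simple rule to find them.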
Thomas: I reached a similar place with Patron Toolbox. The earliest tools required users to fill out many fields. For example, to create a pitch, the tool asked who the protagonist is, what the protagonist wants, and what obstacle stands in the way. It walked users through building a pitch and then fed that information to the model, which generated back-cover copy, a blurb, and a pitch.

As I gained experience building tools and as context windows grew larger, many tools became much simpler. Now you upload your book and click a button. The system figures out who the protagonist is and what that character wants.
Some people wanted a field where they could specify the genre, but I intentionally avoid that. The tool tries to infer the genre on its own. If it guesses wrong, that may reveal a real issue with the book. You might think the book belongs to one genre, but the tropes and narrative approach point to another.
If the AI reader is “confused” about your genre, your human readers will be too. But it’s better to learn about your genre problems from AI before publication rather than hearing it from readers who leave a bad review on the World Wide Web. If your genre is unclear, it can hurt reviews and reduce marketability.
Most of the new tools I build ask you to upload the book, and the tool does its analysis.
What is the Roast Engine and why build it?
Thomas: The most recent tool I built is called the Roast Engine. It includes more than forty simulated angry trolls that leave one-star and two-star reviews. The model reads your book and selects the twelve most relevant trolls to review it.
Each troll cares about different aspects of a book. One focuses heavily on worldbuilding. If you write a romance, that troll will have nothing to say, but there are romance-focused trolls as well.

The idea started as a joke among my Novel Marketing patrons, but people became genuinely interested.
AI systems tend to be overly flattering unless you work very hard on prompting. I wanted to see how far I could push in the opposite direction. It turns out you can push it quite far. The tool is painful to run, but it is better to receive harsh feedback privately from an AI that will always leave a negative review than to receive it publicly from a human. Public negative reviews live on Amazon for the life of the book.
Jason: That is a great application. There is a tool called AutoCrit that recently released similar features. It allows you to create a genre-specific “beta reader” that behaves in a particular way when giving feedback. It is a fascinating approach.
When should authors use Claude vs. Gemini?
Thomas: Many writers consider Claude the best model for writing. It is not always number one, but it almost always ranks in the top three. Depending on how recently a new Claude model launched, it is often ranked first or second.
The other major model you have been using is Gemini 3. Google does not release new versions often, but when it does, Gemini tends to jump straight to the top of the benchmarks. As we record this at the end of 2025, Gemini ranks in the top one to three on most evaluations.
When do you use Claude, and when do you use Gemini?
Jason: Claude has been the most consistent model for creative writing. Even back in the days of GPT-3.5 and GPT-4, Claude 2 already felt stronger for prose. Looking back now, it is funny how good we thought Claude 2 was at the time.
Thomas: This may be a controversial opinion, but I think classic GPT-4, not GPT-4o, was the best GPT model. It was genuinely excellent. I felt GPT-4o was a step backward. It was more efficient, but not better.
One problem OpenAI faces is scale. They are so popular that they optimize for efficiency because they cannot build servers or acquire GPUs fast enough.
Jason: The current Claude 4.5 lineup includes Haiku, Sonnet, and Opus. Haiku is the cheapest and least powerful. Opus is the most powerful. Sonnet sits in the middle for both cost and performance.
Sonnet and Opus are especially strong for creative writing tasks.
Thomas: Claude also has the clearest naming scheme. OpenAI’s naming has been confusing. At one point, it had GPT-4o and o4, which were entirely different models.
Claude’s names make intuitive sense. A haiku is short, a sonnet is medium, and an opus is long. The version number tells you how new the model is. They deserve full credit for clarity.
Which model works best for first drafts?
Jason: I use AI for some first-draft writing. Opus has been the strongest model for that stage, particularly for prose quality. The newest Opus 4.5 is especially compelling because it costs about a third of the previous version.
Opus 4.1 was extremely expensive. Opus 4.5 reduced the price dramatically, which made it far more practical.
Gemini 3 is the first model since the days of Claude 2 and GPT-4 that truly rivals Claude’s top model for creative writing. That said, they feel quite different.
I ran polls where I wrote the same scene using both models and asked people to choose which they preferred. Most people leaned toward Opus.
Opus tends to be rich and dramatic. That works beautifully for certain scenes, but it can become exhausting over the course of an entire book. I have written roughly two books’ worth of content with Opus, and it often treats every interaction as if the world is ending.
For climactic scenes, Opus is my first choice. It excels at dramatic, high-impact moments. Gemini 3, however, feels more natural to me. I think it handles dialogue better.
The two models are very close. I have even considered drafting dialogue with Gemini 3 and then having Claude fill in description while preserving that dialogue. It is about combining strengths.
Thomas: That is where things are heading. Authors using AI are starting to think more like directors or producers. You choose the right “actor” or “director” for each scene.
Different models suit different tones, genres, or moments. What feels too dramatic to one author may not be dramatic enough for another. “Best” becomes subjective.
Does genre influence which model works best?
Thomas: Some scenes require strong spatial reasoning and an understanding of physics, such as fight scenes. Romantic dialogue does not. Certain models handle spatial reasoning better than others, so mixing tools can make sense.
Jason: Absolutely. Writing contemporary romance is very different from writing epic fantasy. I write fantasy, so most of my testing focuses there.
Fantasy dialogue has a distinct tone. Some older models assumed fantasy meant archaic or Shakespearean language, which required extra work to correct.
You have to test models yourself. You cannot rely solely on someone else’s recommendation. Your genre and style may work better with a different model.
How can authors prime the AI to align with their own writing style?
Thomas: When you are writing a book using AI, do you load your own writing into RAG to prime the pump stylistically? Do you try to make it feel like writing you have already done?
Jason: Yes, but I do not use RAG for that. I put the material directly into the prompt. It is expensive, but I include it anyway, sometimes even in the system prompt, because I want the output to adhere to my style as closely as possible.
In the automations I build, I include both a 5,000 to 6,000-word sample of my writing and a style sheet I created with AI. I had AI analyze my work, break down the elements that make my style distinctive, and then lay those out clearly. I include both of those in the prompt every time.
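As a sketch of what Jason describes, assembling that context might look like the following. The file contents and the strings here are hypothetical; the point is that the writing sample and style sheet travel with every prompt rather than sitting in a retrieval database.

```python
def build_system_prompt(style_sheet: str, writing_sample: str) -> str:
    """Combine a style sheet and a writing sample into one system prompt."""
    return (
        "You are a ghostwriter. Match the author's style exactly.\n\n"
        "STYLE SHEET:\n" + style_sheet + "\n\n"
        "WRITING SAMPLE (5,000-6,000 words):\n" + writing_sample
    )

# Hypothetical inputs; the sample is included verbatim, not retrieved via RAG.
style_sheet = "Short sentences. Dry humor. Close third-person POV."
writing_sample = "The rain had opinions about my plans..."
system_prompt = build_system_prompt(style_sheet, writing_sample)
# system_prompt is then attached to every request, e.g. as the 'system' message.
```

Including the full sample in every call is what makes this expensive: you pay for those tokens on each request, which is the trade-off discussed next.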
Why not use RAG (Retrieval-Augmented Generation) for style matching?
Thomas: That gets expensive because you are paying for the tokens in that 6,000-word sample every single prompt.
Jason: It does, but it is much more effective than RAG. With RAG, the system retrieves what it thinks is relevant for that prompt, and it may not retrieve everything you want.
Thomas: And RAG content is chunked and vectorized, which lowers fidelity.
Jason: For anyone wondering, RAG stands for Retrieval-Augmented Generation. Instead of sending your whole document to the model, the system stores it in a database and retrieves only the pieces it judges relevant to each prompt.
Thomas: I use RAG a lot in Patron Toolbox. The model I use most for that kind of work is Grok 4 Fast Reasoning. I like it because it can hold up to two million tokens of context, which means nothing has to be vectorized or chunked at all.
You can load an entire book into Grok Fast Reasoning, and it can remember the whole thing. That makes it excellent for compendium-style tools, such as listing all plants and animals, factions, characters, or any task where you need to find a needle in a haystack across a full manuscript.

Grok Fast is particularly good at that. Its naming scheme is not great, though. There is regular Grok and Grok Fast. Unlike other platforms where “fast” means cheaper and weaker, Grok Fast is actually better because it has a much larger context window.
Grok Fast can handle up to 2,000,000 tokens, while regular Grok handles around 200,000 to 300,000. That is not a limitation for most projects, but for epic fantasy authors, it is huge. A 150,000-word novel can easily exceed 200,000 tokens, since many words take multiple tokens.
You can think of a token as roughly similar to a syllable, although it is not quite that simple. That is also why AI struggles with tasks like counting the Rs in “strawberry,” because “strawberry” is two tokens.
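A quick back-of-the-envelope estimate shows why a 150,000-word epic brushes up against a 200,000-token window. The 4/3 tokens-per-word ratio below is a common rule of thumb for English prose, not an exact figure; real counts depend on the model's tokenizer.

```python
def estimate_tokens(word_count: int, tokens_per_word: float = 4 / 3) -> int:
    """Rough token estimate; actual counts depend on the model's tokenizer."""
    return round(word_count * tokens_per_word)

print(estimate_tokens(150_000))  # 200000 -- at the limit before counting the prompt itself
```

Add a system prompt, instructions, and the model's own output, and a manuscript of that length overflows a 200,000-token window.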
Thomas: Grok is not as strong at writing as Claude or Gemini. Sometimes, even GPT can write better prose. That said, the newer versions of Grok are improving. Grok 4.1 is more conversational and less hyper-nerdy.
Grok tends to have a personality that feels like an enthusiastic, friendly, awkward nerd at a party. That works well for marketing, analysis, and research tasks, but it is not always ideal for rewriting scenes.
How do the models compare on censorship?
Jason: Grok is much less censored than Claude or GPT.
Thomas: It is, although it is still more censored than many open-source models. Open-source models like Kimi K2 or DeepSeek often have fewer restrictions.
Jason: Some of those are still censored, but Grok will do most of what people want. The advantage of Grok is that it is a very powerful reasoning model. Even if its writing is not quite at the top tier, it can be very effective when used correctly.
Thomas: I have worked with Alexander McCree, who is building a conservative AI chatbot designed not to censor conservative topics. While helping him commercialize it, I found that ChatGPT resisted many of the techniques he used.
When I tested the same approaches with Grok, it was much more willing to follow the prompt. He gave me permission to create a version inside Patron Toolbox, and that version uses Grok.

I found Grok was willing to go where the prompt led it rather than putting on what feels like a social justice filter. For example, it did not resist the idea that Western civilization or Christianity could be discussed as forces for good.
Grok still leans politically progressive unless you nudge it, but if you do nudge it, it will go there. I was generating book club discussion questions, and it started producing odd gender role questions that were not appropriate. I told it, “Give me less woke, more conservative questions,” and it immediately adjusted.
Claude would never do that. It is the most locked-down model I have encountered in terms of safety parameters.
Are other models becoming more flexible?
Jason: They are. All the major models have become more flexible. Gemini 3, for example, can produce fairly edgy content if you guide it properly. People often ask whether models can handle not-safe-for-work content, so I test that.
It resists blunt requests, but if you provide sufficient context and guide it carefully, it can go there.
Thomas: Gemini was famously very progressive a year or so ago, especially with its image model. I have not tested it much politically recently. Have you?
Jason: I tend to be politically neutral, so I have not tested that extensively. From what I have seen, though, most major models have opened up over time.
In the past, including a word like “bomb” in a prompt would shut the model down. The system prompting has become much more sophisticated as the models themselves have improved.
How has competition changed AI censorship?
Thomas: Competition has played a big role. When DeepSeek was released, it felt like Sputnik going into orbit. Suddenly, everyone realized they were behind.
DeepSeek is censored in very different ways. It will talk about bombs without hesitation, but it will completely shut down if you mention Tiananmen Square or the Chinese Communist Party.
What should authors know about open-source models?
Thomas: There are two broad types of open-source models. The first are large Chinese models that run on cloud servers, similar to Grok or Claude. These require powerful infrastructure and include models like Kimi K2, some DeepSeek variants, and Qwen.
Many Americans are uncomfortable trusting Chinese hardware, so these companies release the software as open source. Americans can then run the models on American servers using services like OpenRouter, which allows you to avoid sending data back to China.
The second category consists of small models designed to run locally on your own machine. I started experimenting with these today, and for most authors, they are not worth it.
Jason: I do not even bother with them.
Thomas: You take a major hit in intelligence to get a model small enough to run locally, even on a powerful computer. You also take on a huge cognitive burden. You have to install and configure everything yourself.
When you use Grok or Claude, things like temperature are handled dynamically. If you run a model locally, you must set those parameters manually, which requires understanding what they mean.
I will not even explain temperature here because it is not important for most authors.
That said, many open-source models are very strong. On some writing benchmarks, Chinese open-source models occupy the top spots, even above Claude.
Are open-source models useful for writing?
Jason: I have not tested Kimi K2 yet, but I have tested Qwen and DeepSeek. I like Qwen models for creative writing, and even more for brainstorming and outlining.
Because they are trained primarily in a Chinese cultural context, the ideas feel more out-of-the-box compared to American-trained models. When I want fresh ideas, Qwen can be refreshing.
Many models tend to produce generic ideas unless you structure prompts very carefully. Using models like Qwen, and likely Kimi K2, can help generate ideas that feel fresh.
Thomas: American models are trained on a lot of fanfiction. Fanfiction sites have licensed much of that material to AI companies, although I would need to double-check the specifics. Licensing is one way those sites support themselves financially.
Jason: There is also the reality that if something is publicly available on the web, it is generally treated as fair game for training. There is an implied permission when content is freely accessible online, at least under current legal interpretations. There are exceptions, but broadly speaking, that is how training data has been handled.
Is it legal for AI to train on copyrighted books?
Thomas: This is an important topic because many people are concerned about copyright. So far, courts have been clear that an AI reading a copyrighted work is not meaningfully different from a human reading it.
AI does not store text word-for-word. It stores information in a vectorized, mathematical form, similar to how human memory works.
Jason: It is learning the mathematics of a paragraph.
Thomas: Exactly. It gets technical very quickly, but one court used the phrase “exceedingly transformative.” That places AI training under fair use, similar to parody.
Copyright is not absolute. There are exclusions. For example, you can dream about a book you have read, even replay scenes in your head, without violating copyright. That mental derivative work is obviously not infringement.
Jason: Maybe in some Black Mirror future, there will be an episode where you can get sued for your dreams.
Thomas: None of this has reached the Supreme Court. No laws have been passed specifically governing AI and copyright. It is unlikely that will change soon. In 2025, Congress passed only a handful of laws, which suggests the legislative branch is barely functioning as a lawmaking body.
The courts are also in no rush to act. As a result, AI companies are largely doing what they want. There are no meaningful legal speed bumps.
Politically, both parties in Washington agree that the West needs to win AI. From an elite perspective, AI is not about copyright or even money; it’s about power.
The same technologies that power chatbots also power weapons systems. Whoever controls AI in this era controls the next era.
That is why concerns like electricity shortages or climate change have taken a back seat. Power is power. Many people who once focused on environmental issues now see AI as the path to influence and control.
Some even oppose widespread AI adoption, not because they dislike AI, but because they do not want competitors to have access to it. For them, AI represents power over humans.
How is China using AI?
Thomas: China is the clearest example of this approach.
China operates a massive AI-driven surveillance state. Everything you say and everywhere you go is tracked. AI systems generate a social credit score that determines how many rights you have.
The lower your score, the fewer freedoms you enjoy. This is beyond the scope of this episode, but it is worth researching. It is also why China’s AI capabilities are genuinely concerning.
At the same time, Chinese models are very good at writing because they are trained on literature that Western readers have never encountered. Tropes that are cliched in Chinese folklore feel fresh in English because the English-reading audience has not seen them before.
These story structures have worked for thousands of years and billions of readers in the East. Filtering them through models like Kimi K2 can introduce Eastern storytelling patterns to Western audiences in an interesting way.
Can AI help solve problems beyond writing?
Jason: For people concerned about environmental or global issues, AI is actually one of the most promising paths forward. It may help solve clean energy challenges, climate issues, and medical problems.
The rapid development of COVID vaccines was accelerated by AI-driven protein folding research. These models are being applied far beyond writing and business. They are driving innovation across nearly every field.
That is another reason the United States wants to stay ahead in AI.
How is AI transforming historical research?
Thomas: AI is also powerful for translation. Many historical documents are effectively frozen in their original languages. There are millions, possibly billions, of documents that will never receive grant funding for translation.
I follow Roman Helmet Guy on Substack. He focuses on Venetian diplomatic and intelligence reports. For centuries, Venetian spies and diplomats submitted detailed reports to the Venetian Senate about events in places like Constantinople or Germany.
These reports offer a remarkable third-party perspective on world events because they were written to inform real political decisions, not to satisfy historians.
For hundreds of years, these documents were only available in Italian. Roman Helmet Guy is using AI to translate them and release them publicly. He is publishing new translations regularly.
Reading these reports is fascinating. They provide an uncensored view of the world as it was, directly from people who were there.
If you are writing historical fiction set in the 1500s, you should explore this work.
How are you using AI for Arthurian research?
Jason: That example resonates with me because I am doing something similar. I write Arthurian fantasy, and one of the most influential sources is the Vulgate Cycle, a massive collection written in Old French in the twelfth or thirteenth century.
The most influential English adaptation is Le Morte d’Arthur by Sir Thomas Malory, which condensed the Vulgate Cycle centuries later.
The only complete English translation of the Vulgate Cycle was published in the 1990s and is under copyright. It consists of about ten volumes and costs roughly $400. I own it, but it is expensive and inaccessible.
The original French versions are public domain and freely available online. I am using AI to translate those texts and make them more accessible to Arthurian enthusiasts.
Thomas: For your own writing, you could load that entire corpus into RAG. You would not put it into the prompt itself because of token costs, but retrieving it as context would give your stories a grounded connection to the original material.
Arthurian history is frustrating because we have so few contemporary records. Some scholars question whether Arthur existed at all.
Jason: He definitely existed. The reality just looked very different from the legend we imagine today.
Thomas: I have been experimenting with something similar using Jane Bot and Twain Bot. These are AI beta readers loaded with the complete works of Jane Austen or Mark Twain, along with biographies and Wikipedia content.
I recently migrated Jane Bot to Grok 4 so it could hold all of Austen’s work in context. It is fascinating to see how AI can bring historical voices into the present through translation and contextualization.
Why do many authors have a bad experience with AI?
Thomas: Many authors who dislike AI have only used the free version of ChatGPT with a single-sentence prompt. The result is usually unusable.
Jason’s approach is different and sophisticated. It requires understanding the tools and treating AI like an instrument, not like a vending machine.
AI does not work by typing “write me a book about a wizard” and receiving a finished novel.
How should authors decide what to delegate to AI?
Thomas: Give AI the parts of writing and marketing that you do not enjoy. If you love writing rough drafts, do not hand that off. If you hate editing, AI can help. These tools are surprisingly good at catching typos and spotting developmental issues.
On Author Update, we discussed a report from Publishers Weekly that said 11% of traditional publishers are already using AI for editing, and that is just what they admit. The real number is likely higher.
What tools provide access to multiple AI models?
Thomas: If you want to try different models, there are services that let you access multiple systems for one monthly fee, rather than paying for several subscriptions. T3 Chat is one of the cheapest at around $8 a month. Straico is in the $10 to $20 range. Magai is around $20 per month.
Magai allows you to keep your conversation context when switching models. You can talk to Grok, then move that context to Claude, and switch back and forth.
We are also planning, in early 2026, to add a new Patron Toolbox tier that gives access to multiple chat models, including Perplexity, Claude, and Gemini. We are still working out the technical details so that I can do it without bankrupting myself. We will do a limited launch first to make sure the economics work.
Jason: Perplexity is my favorite mainstream chatbot because it includes access to several major models.
I use OpenRouter a lot because it has access to almost every model you can think of. It is pay-as-you-go, which can be more affordable for many people.
With OpenRouter, you load a balance of $20. Depending on the model you use, it deducts a little at a time. Cheaper models use very little. Expensive models deduct more. When the balance runs out, you can reload it.
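Under the hood, OpenRouter exposes an OpenAI-compatible chat endpoint, so a pay-as-you-go call is a standard HTTP POST. The sketch below builds such a request using only the standard library; the model id is illustrative, and you would supply your own API key.

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def openrouter_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenRouter chat-completion request."""
    payload = {
        "model": model,  # cheap models deduct pennies; frontier models deduct more
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = openrouter_request("sk-or-...", "anthropic/claude-sonnet-4.5", "Summarize chapter one.")
# urllib.request.urlopen(req) would send it and deduct from your prepaid balance.
```

Because you choose the model per request, the same $20 balance can stretch across cheap drafting models and occasional calls to an expensive one.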
I also like OpenRouter because it connects directly to other tools I use and talk about. The main one lately is n8n, an automation platform. That is more complex, so I would not start there until you are ready.
It also integrates with fiction-focused tools like NovelCrafter and Raptor Write. Both connect to OpenRouter, so it is worth exploring, especially if you want to connect models to writing software.
Thomas: Sudowrite is a tool built specifically for writing books and connects to Gemini 3, Claude models, and other major systems.
It also has useful features, such as a “show, don’t tell” tool. If you want to dabble, Sudowrite is a strong starting point because it has a lot built in.
Jason: I have tutorials on Sudowrite. It is one of three “wrapper tools” I consider worth checking out. The other two are NovelCrafter and Raptor Write. All three have different pros and cons.
I do not use Sudowrite as much anymore, but its proprietary model is very good. It may be less necessary now with models like Opus 4.5 and Gemini 3, but Sudowrite is still one of the big three.
Thomas: Most writing benchmarks are created by AI people. They have AI write, then have other AI critique that writing and produce a scoreboard. Humans are not involved at any point. Having a human writer like Jason doing evaluation, even if it is partly “vibes,” is genuinely useful.
Jason: I used to be more scientific, but I found that scoring even one model thoroughly could take all day. I stopped doing that level of analysis.
Now I review major model releases and share my thoughts as someone who has written books, including several before AI was around, and who worked at Kindlepreneur for a couple of years.
I have a solid sense of what makes decent creative writing. The official benchmarks rarely match my real-world experience.
Thomas: That is exactly why I wanted this conversation. I also want you to test Grok 4.20 when it comes out. Early reports suggest it may be a major leap forward. It is supposed to be released either by the end of the year or in the first couple weeks of January, so keep an eye out for Grok 4.20.
What final advice do you have for authors exploring new AI tools?
Jason: The most important thing is not to feel overwhelmed. The moment you feel overwhelmed is the moment you stop, and the moment you stop is the moment you give up the potential of what this can do for you.
If you are overwhelmed, it is okay to stick with ChatGPT. You can do a lot with it. But have fun, experiment, try different tools, and watch what other people are doing. You can also invest in a course or two, or watch my YouTube channel.
Most people think “writing with AI” means you type a one-sentence prompt and a book appears. That is not how it works. There are many approaches beyond that.
Even if you never want AI to write prose for you, it can still make your author business more enjoyable and free you to focus on what you love.
Where can authors learn more about using AI in their writing process?
Jason: The best place to start is my YouTube channel, Nerdy Novelist.
For more structured training, help, and support, I have a membership group called Story Hacker Gold that walks you through the steps to produce a quality novel with AI, not “slop.” It is only slop if it is bad and readers do not like it, so we focus heavily on how to do this well.

