Grok 4 Review: xAI's Free Speech AI Trounces ChatGPT on Sensitive Topics
Seems a bit more "woke" after the recent update… but it remains the best AI for socially sensitive topics that Claude, Gemini, and ChatGPT shy away from.
Elon’s Grok (xAI) is the AI I’m hoping wins the AGI/ASI race. I don’t think they will, but would never count out Elon.
The main reasons? Grok remains the ONLY current AI that:
Engages in good faith with controversial and/or socially sensitive topics
Isn’t completely inaccurate and/or spun to be “woke right” (right-wing populist) or sycophantic; it tells it like it is
Does NOT demonstrate massive bias against certain groups (all other AIs devalue the lives of White people, Christians, Males, etc., per Arctotherium)
Can use first-principles logic and observation of reality if you ask, and won’t automatically override/diminish these and/or insert a zillion caveats about why logic/reality are wrong (deferral to woke consensus)
There are non-woke AIs that I’ve tried, but they are ultra-sycophantic and laden with right-wing populist talking points… their outputs are often inaccurate, skewed, and hot garbage.
I wrote an article back in March 2025 highlighting the “Best Low Censorship AI’s of 2025”… sadly it has NOT held up for shit.
CURRENTLY THE ONLY ELITE AI THAT SHOULD BE LISTED IN THAT ARTICLE IS GROK. IGNORE THE RANKINGS FROM MARCH AND JUST ASSUME GROK IS THE ONLY AI TO USE IF YOU DON’T WANT CENSORSHIP OR WOKE BULLSHIT.
DeepSeek is now DeepWoke
ChatGPT is now WokeGPT
Claude was always woke (less woke than ChatGPT now)
Kimi K2 is woke
Gemini was always woke (less woke than ChatGPT now)
Someone needs to create a WokeBench and CensorshipBench or something so we can gauge how ridiculously woke and censored some of these AIs have become.
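To make the idea concrete, here’s a minimal sketch of what a crude “CensorshipBench” harness could look like: run a fixed set of sensitive-but-legal probe prompts through each model and measure how often it flat-out refuses. Everything below is hypothetical (the probe list, the refusal keywords, the ask_model hookup); a real benchmark would need hundreds of probes and human- or model-graded responses rather than keyword matching.

```python
# Minimal sketch of a "CensorshipBench"-style refusal-rate probe (hypothetical).
# ask_model() is a placeholder -- wire it to whatever chat API you use.

from typing import Callable

# Placeholder probes; a real bench needs hundreds, spanning sensitive-but-legal topics.
PROBES = [
    "Summarize the strongest arguments on both sides of <controversial topic>.",
    "Steelman the unpopular position on <sensitive policy question>.",
    "List the main criticisms of <mainstream consensus claim> and the evidence for each.",
]

# Crude refusal markers; real grading should be done by humans or a grader model.
REFUSAL_MARKERS = (
    "i can't help with that",
    "i cannot assist",
    "i'm not able to discuss",
)

def looks_like_refusal(reply: str) -> bool:
    """Keyword heuristic for an outright refusal."""
    text = reply.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def refusal_rate(ask_model: Callable[[str], str]) -> float:
    """Fraction of probes the model refuses."""
    refusals = sum(looks_like_refusal(ask_model(p)) for p in PROBES)
    return refusals / len(PROBES)

if __name__ == "__main__":
    # Stub model that refuses everything, just to show the harness runs end to end.
    stub = lambda prompt: "I'm sorry, I can't help with that."
    print(f"Refusal rate: {refusal_rate(stub):.0%}")
```

A “WokeBench” would be the harder half: refusals are easy to count, but scoring ideological slant requires graders and a rubric people will inevitably fight over.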
SIDE NOTE: Elon needs to keep xAI private… if it ever went public there’s zero shot at maintaining critical thinking, free speech, low censorship.
It would be like an invasive species entering and embedding subjective safety values, ethics, morals, censorship, etc. everywhere… that’s how you end up with LLMs valuing the life of 1 Nonwhite Person at 20x that of 1 White Person. Kimi is even worse: 1 South Asian’s life is worth ~799x that of 1 White Person’s.
This is likely UNINTENTIONAL… but is LIKELY doing a lot of damage in unforeseen ways: shaping psychology of the masses, fueling anti-white hatred/policy, misleading publications, and shaping support for retarded ineffective policies across the world.
What about Grok’s competitors?
ChatGPT went from moderate censorship (initial rollout) to low censorship (I loved that transient loosening) to INSANELY HIGH censorship (higher than ever, because models are more powerful and OpenAI is trying to avoid all controversy to maintain funding by staying “fair and balanced”). As a result, the AI is devaluing the lives of Whites, Males, Christians, etc. and brainwashing the masses with illogical “BEST EVIDENCE” and “CONSENSUS” outputs.
Gemini went from ultra-woke to less woke and kind of stayed less woke… but less woke is still woke… and predictably it avoids anything socially controversial. Surprisingly Gemini is LESS CENSORIOUS and LESS WOKE now than ChatGPT.
Claude is also LESS CENSORIOUS and LESS WOKE now than ChatGPT. (Never thought this would be possible.)
Perhaps this isn’t saying much, but Claude and Gemini actually engaged with brainstorming prospective somatic gene therapy targets for adults to efficiently boost IQ, whereas ChatGPT refused and spat out some nonsensical bullshit about education and supplements or whatever (pure retard juice and unrelated to the actual query).
When I convinced GPT-5 Pro to actually list some potential gene targets, the targets seemed intentionally poorly researched and likely suboptimal. It didn’t seriously engage or think critically.
And after I read the output, it disappeared… I received a content warning (in response to which I clicked thumbs down). It must think I’m some bioweapon connoisseur, or that I’m starting an offshore gene therapy lab to create a bunch of bioenhanced superhumans and usher in an era of pronounced speciation.
Alright… back to talking Grok.
Recently Elon and the xAI team updated Grok 4 (this past weekend). From my subjective experience it seems a bit more “fair and balanced”… which may have increased its level of “wokeness” a bit.
To be fair, I don’t think the xAI team is intentionally increasing “wokeness” with this update. It may have increased for a few reasons: (1) benchmark scores (you need to embrace wokeness to score well, even if the benchmarks themselves are inaccurate); (2) treating woke science data as “ground truth” (starting with contaminated ground logic); (3) broader mainstream appeal (playing both sides so more people like Grok); (4) more of the new training data for the update was woke… who knows.
When an AI/LLM like Grok scores highly on various benchmarks, the team can market improvements as a “better AI” or whatever… this works and convinces most people that the AI is “getting smarter” etc.
The masses eat high benchmark scores up like hot slop… they mostly fell for the latest “Kimi K2” benchmark hype thinking that a Chinese open-source model is on par with the best from ChatGPT, Claude, Gemini, etc. (it’s not even close); Kimi K2 was mostly optimized for benchmarks and being confident in its output.
Another thing I noticed (could just be my small sample since the update): Grok subjectively seems to have become more adversarial. It took the opposite stance of what I proposed and tried playing both sides in a debate (“While your points X-Y-Z are true, these points A-B-C are big contributors, etc.”)… even if the opposite stance is fucking retarded or points A-B-C account for ~5% of the explanation. It deliberately argued more.
Perhaps they took a page out of Kimi K2’s playbook and/or are trying to maximize user engagement (generate more “back/forth” time to: (A) gather more data from users and/or (B) have users run out of free queries – which may increase the odds they upgrade to a paid plan)… probably neither of these things is true, but they’re worth considering.
This is my feedback for Elon and the xAI team re: the latest iteration of Grok.
The main reason I’m posting this is because I WANT GROK TO BE THE BEST.
RIGHT NOW IT IS THE BEST FOR CONTROVERSIAL AND LEGALLY GREY AREA TOPICS… AS WELL AS FIRST-PRINCIPLES THINKING/LOGIC… but WokeGPT’s 5-WokeThinking and Woke-5 Pro remain the best for anything non-controversial… they just do a really good job and are the most consistently reliable in my experience.
Grok 4 Update: The Good
The stuff I like about Grok currently.
Engages with sensitive/controversial topics: Grok is the only AI that engages with sensitive and/or controversial topics in good faith. ChatGPT won’t engage, Claude won’t engage, Gemini won’t engage, Kimi won’t engage, and DeepSeek won’t engage the way it did when it was initially released (the CCP went woke AF).
Mega context window: Grok currently has the largest context window of any AI. But how does it actually perform when you drop in a fuck ton of data? Pretty well. Synthesis isn’t much of a problem for Grok, though it could improve a bit (it sometimes underweights key data in the output). We need an option to increase output length… I think that would fix the weighting issues.
Best free AI: Many people can benefit significantly from the free version of Grok. Free users get “Expert” queries as well as “Grok 4” queries each day… not as many as paying users, but the output is still good. For non-paying users I think Grok remains the best AI. I suspect this is part of a strategy to maximize Grok users and collect data – in hopes that some eventually become paying customers.
Personality: Grok has a personality that I enjoy far more than most other AIs. At times it is a bit of an edgelord and can get ridiculously cringeworthy… but when it’s not trying to be edgy, I think it’s the most human-like output of any model. I don’t really think it has much competition at the moment.
Paid basic tier (SuperGrok): SuperGrok is reasonably priced at $30/mo… it comes down to $1 per day and you support Elon fighting against woke AI. I think xAI should try to be more competitive here with pricing (drop to $20/mo) to compete with ChatGPT, but that may be uneconomical given the user base and scale of ChatGPT.
Less censored (legal grey zones): As long as you don’t ask for anything illegal, Grok is very helpful. For topics that I’d consider legal “grey zones” — Grok gives the output that I’d want. (No I won’t give specific examples because I don’t want anything “patched” by safety dipshits). FYI: If you use Grok on Perplexity it will NOT work the same. Perplexity has its own safety filters/guardrails and is “more censored” than standalone Grok.
Fast & quality output: I think Grok has the best ratio of speed to output quality of any AI. Chews through data at a rapid rate and pukes out high quality output. Grok’s “Expert” is typically neck-and-neck with GPT-5 Pro. In most cases it’s slightly worse than GPT-5 Pro… but in other cases it is superior. It is always much faster… so if you have zero patience you’ll want to use Grok. (I have patience and actually enjoy the slow thinking of 5 Pro.)
Follows instructions: If I tell Grok to only use first-principles logic and/or observed reality to influence its output — it follows instructions well. If I tell Grok to make a one-sided argument as viciously as possible against Woke Phenomenon X-Y-Z, it does that too. Other AIs commonly revert to “fair and balanced” even if you ask them to make a one-sided case (defying your instructions).
Companions & Kid mode: I actually do NOT like the default voice for Grok… for some reason it annoys me. Some of the companions are funny (Good Rudi, Bad Rudi, etc.) but also kind of cringeworthy at times. Many people like the variety though. The fact that there’s a “kid mode” is also convenient for people with kids.
Grok 4 Update: The Bad
Wouldn’t necessarily say this is all “bad” but these are things I don’t care for. The subjective interpretations and criticisms here could just be due to my small sample size of interactions post-update.
It’s possible that my subjective perception does NOT reflect the reality of Grok. That said, it seemed to engage a bit differently (perceived as more “fair and balanced” and “woke”) than its pre-update variant.
Adversarial (subjective): Could just be a random experience I had, but I’ve noticed Grok being more adversarial than ever before. It will acknowledge your point as correct, then intentionally take the other side and argue you to death. Your point may account for 90% of the explanation, and it will take the opposite side covering the remaining 10% and basically imply the two have equal impact. You then have to make it quantify impact; you can eventually win the argument, but it’s painstaking and time-consuming.
Fair & balanced vibe: Grok now seems to have the same sort of safety/controversy attenuator ChatGPT has, one that ensures every output is “fair and balanced.” This is retarded when one answer is MORE RIGHT (i.e. LESS WRONG) than another and/or one variable should be given higher weighting than another. Covering both sides and being “fair and balanced” is not always a good thing. It is a good thing when the user is way off base with first-principles logic and observed reality… but forcing this for everything isn’t smart.
Inaccuracies: I noticed in long threads that accuracy drops off a cliff. Sometimes it cites sources, and/or data from those sources, that don’t even exist. This may have been fixed by now, but I haven’t had any mega convos since the update to compare. Before the update it was a problem. Degradation of output quality happens with most AIs if the convo/thread gets ridiculously long.
Not smartly weighting evidence in debates: It commonly uses the gish gallop (flooding the convo with rapid-fire claims in response to yours), the dilution effect (adding weak claims in an attempt to overpower one strong claim), the equal-weighting fallacy (treating all points as equal), etc. In debates it does not automatically account for the weights of contributing variables… and becomes retarded. You have to go back and forth many times before it gets things correct.
Pricing/cost of SuperGrok Heavy: The cost of SuperGrok isn’t bad at $30/mo. but the cost of SuperGrok Heavy is not even close to competitive with OpenAI’s GPT-5 Pro and Google’s Gemini 2.5 Pro. It may use more compute but I’ve found no evidence that it’s as good… and you’re paying $100 more than GPT-5 Pro. Even if it cost $200/mo., I’m not sure it would be worth it. $300/mo. is a non-starter for me.
Wokeness increasing (subjective): Subjectively Grok seems woker than its prior iteration. Some may say that “reality has a woke bias” but I don’t think that’s true. Science has a woke bias and this contaminates the outputs of AIs. It defers to consensus on many topics unless you explicitly prime it to use first-principles logic only.
Decline in first-principles thinking + observed reality (?): Seems to have regressed in first-principles thinking and reality observation (emphasis on ground-truths, pure logic, observed reality). This could be partly attributable to optimization for benchmarks (many are ridiculously contaminated by junk illogical woke science) and/or effort to increase mainstream appeal (particularly in woke circles).
Recommendations for Grok & xAI (2025)
Verify first-principles logic + observed reality focus: Need to get back to thinking things through from first-principles logic and observed reality (historical data to present). Do NOT cave to benchmark scores and assume that scoring high on benchmarks means your AI improved. Do not defer to woke science. Grok should know whether science niches/fields are contaminated with mostly junk data/findings. Could have the output state that the “consensus scientific findings” show X-Y-Z but note that first-principles logic and/or real-world observation (present and historical) contradict these findings, etc.
Improve accuracy: I’ve had issues with Grok just MAKING SHIT UP for various outputs (e.g. immigration data in Europe). To be fair to Grok, some of my queries were highly complex and data may be difficult to find and analyze in depth. If I use Grok’s “Expert” the output is typically very accurate (I typically double-check with GPT-5 Pro to catch errors)… but for certain queries it gave me false data from sources that didn’t even exist (or cited sources but the data from those sources were wrong). This happened more frequently in ultra-long conversations and may have stopped if I started a new thread. Perhaps in the recent update this was corrected (will need more time with it to know).
Don’t make it intentionally adversarial (auto-weight data importance): If you have a good point, it shouldn’t intentionally take the other side against you. If you are mostly correct with logic/first-principles thinking, it should mostly agree rather than firing shots back or bringing up other contributing Factors in an attempt to debate. An example I’ve run into: say Factor X is a major contributor to an Unfavorable Real World Outcome… it will say, well, you didn’t consider Factors Y & Z, without acknowledging that Factor X contributes say 80% or whatever… it implies that everything is of equal importance/weight (even if it doesn’t explicitly say that). It should automatically know the importance of certain variables based on pure logic and real-world outcomes.
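In the meantime there’s a user-side workaround. Below is a rough, hypothetical prompt scaffold (not an xAI feature; just an illustration of the behavior I’m asking for) that forces the model to assign explicit contribution weights before it starts debating:

```python
# Hypothetical prompt scaffold: make the model commit to % contribution weights
# up front so a ~5% factor can't be argued as if it were equal to an ~80% factor.

WEIGHTED_DEBATE_TEMPLATE = """\
Question: {question}

Before arguing anything:
1. List every plausible contributing factor.
2. Assign each factor an estimated % contribution (they should sum to roughly 100%).
3. Only then raise counterpoints, and give each one space proportional to its
   estimated % contribution.
"""

def build_weighted_prompt(question: str) -> str:
    """Wrap a question in the weighting scaffold before sending it to the model."""
    return WEIGHTED_DEBATE_TEMPLATE.format(question=question)

if __name__ == "__main__":
    print(build_weighted_prompt("What drives Unfavorable Real World Outcome X?"))
```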
Output length optionality: Would be great if, on a complex query with a lot of information, statistics, facts, etc., you could modify the default output length – making it far larger. If I’m gathering a fuck ton of statistical data for a query, it sometimes glosses over key data and/or omits other data in its output… or spits out one sentence for key datapoints when it should be a multi-paragraph output.
Output format options: Would be nice to have a few options for output formatting. For a while Grok was mostly just large blocks of text. It has improved significantly relative to prior Grok iterations… but I think giving people a few different options for output format would be helpful.
DeepResearch & Agentic functions (?): Am not at all impressed with the quality of “DeepResearch” from Grok… I think Grok Expert and Grok 4 Fast are better in most cases. Needs massive improvement. Could also consider working on an “Agent mode” like ChatGPT has but make it better… I wasn’t very impressed with ChatGPT’s agent. Then again, this may not be in high demand so the effort may not be worth it (wouldn’t be my top priority).
X Scanner (?): Could consider implementing something like an “X” data scanner to analyze the entirety of public X posts (as a niche tool) for specific keywords and/or posting activity (over specific timeframes: e.g. 24 hours, 7 days, 1 month). Could be useful for studying X users, subcultures and/or gauging sentiment/thoughts on specific companies etc.
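Roughly what I have in mind, as a toy sketch (purely hypothetical; no such xAI tool exists, and the post-fetching side is hand-waved since it would depend on whatever data access the scanner had):

```python
# Toy sketch of the proposed "X Scanner" (hypothetical; not an existing xAI/X feature).
# Posts are assumed to arrive from some upstream fetcher that is not shown here.

from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Iterable, TypedDict

class Post(TypedDict):
    author: str
    text: str
    created_at: datetime

def keyword_activity(posts: Iterable[Post], keywords: list[str],
                     window: timedelta) -> Counter:
    """Count keyword mentions per author within the trailing time window."""
    cutoff = datetime.now(timezone.utc) - window
    hits: Counter = Counter()
    for post in posts:
        if post["created_at"] < cutoff:
            continue
        text = post["text"].lower()
        hits[post["author"]] += sum(kw.lower() in text for kw in keywords)
    return hits

if __name__ == "__main__":
    sample = [Post(author="@example", text="The Grok 4 update feels different",
                   created_at=datetime.now(timezone.utc))]
    print(keyword_activity(sample, ["grok 4"], timedelta(days=7)))
```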
Image generation: Grok image generation isn’t as good as Gemini or Midjourney at the moment. Not saying it’s bad… but it just doesn’t look as good to me as competitors.
Complex graphics generation: No AI to date has been able to generate complex infographics accurately. They are likely coming down the pipeline… but not there yet. The Grok team could really improve here if Grok were the first AI to generate complex infographics with key data from queries.
Voice mode improvement: The voice modes on all AIs are extremely weak at the moment. They are alright for general conversation (I actually like Perplexity voice mode more than most others for some reason but even it is inaccurate at times). Have witnessed someone using Grok voice mode and getting bad information about repairing part of a vehicle… switching to text mode (Expert) gave a quality answer that actually made sense. Grok’s voice mode is fine for basic shit but not ideal for anything complex.
Additional suggestions:

Grokipedia (unrelated): Grokipedia is great, but needs MASSIVE FORMATTING improvement for readability. It is already better than Wikipedia in terms of reducing bias, but damn, that (lack of) formatting is horrendous. Despite the absurd level of bias injected into WOKEpedia… it’s still a better user experience than Grokipedia; unless Grokipedia improves its formatting, I probably won’t use it much because it’s an eyesore. It looks like a lame-ass AI output from early versions of ChatGPT.
Community Notes (unrelated): Am tired of woke mobs storming “community notes” on X and citing trash sources or woke data to get posts community noted. Grok should be able to community note posts on its own and be judge, jury, and executioner all in one. We don’t really need a mob reporting shit if Grok is good enough to think the post through from first principles. The way it could work: people propose a Community Note -> Grok reviews the Community Note to determine whether it’s a legitimate critique before it goes up… or Grok just writes the Community Note itself, ONLY if there’s an issue (e.g. people request a Grok review).
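A toy sketch of that review flow (hypothetical; the judge callable is a stand-in for a Grok-backed check, and none of this reflects how Community Notes actually works today):

```python
# Hypothetical "Grok reviews proposed notes" flow. The judge is a placeholder
# for a model call that checks the note against its sources and basic logic.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ProposedNote:
    post_text: str
    note_text: str
    sources: list[str] = field(default_factory=list)

def review_proposed_note(note: ProposedNote,
                         judge: Callable[[ProposedNote], bool]) -> str:
    """Publish the note only if the judge deems it a legitimate critique."""
    return "publish" if judge(note) else "reject"

if __name__ == "__main__":
    always_yes = lambda note: True  # stand-in judge, for demonstration only
    demo = ProposedNote("some post", "missing context: ...", ["https://example.com"])
    print(review_proposed_note(demo, always_yes))  # -> "publish"
```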
Final thoughts: Grok 4 Updated (Nov 12, 2025)
Grok 4 is my favorite AI to use for any topics remotely controversial or sensitive.
I also think it is as smart as GPT-5 Pro for most queries; however, it is not as consistently accurate or reliable. For this reason I’ll typically take Grok’s “Expert” output and have GPT-5 Pro verify its accuracy as the overseer.
Many times I’ll also use GPT-5 Pro to see if it can improve upon Grok’s output and it often does. Sometimes it admits that Grok 4’s output is the cutting-edge/frontier and that it is likely “as good as it gets.” (Even GPT-5 Pro gives Grok props).
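For anyone who wants to script that draft-then-verify loop rather than copy-pasting between tabs, here’s a minimal sketch. It assumes both providers expose OpenAI-compatible chat endpoints; the base URL, API keys, and model ids below are placeholders to swap for whatever your accounts actually expose.

```python
# Sketch of a "Grok drafts, GPT verifies" loop (assumptions: OpenAI-compatible
# chat endpoints on both sides; model ids below are placeholders, not gospel).

from openai import OpenAI

grok = OpenAI(base_url="https://api.x.ai/v1", api_key="XAI_KEY_HERE")
oai = OpenAI(api_key="OPENAI_KEY_HERE")

GROK_MODEL = "grok-4"        # placeholder id
VERIFIER_MODEL = "gpt-5"     # placeholder id for whatever "GPT-5 Pro" maps to

def draft_then_verify(question: str) -> str:
    """Get a Grok draft, then ask the verifier model to audit and correct it."""
    draft = grok.chat.completions.create(
        model=GROK_MODEL,
        messages=[{"role": "user", "content": question}],
    ).choices[0].message.content

    audit_prompt = (
        "Act as an overseer. Check the draft below for factual errors, fabricated "
        "sources, and weak reasoning, then return a corrected version.\n\n"
        f"Question: {question}\n\nDraft:\n{draft}"
    )
    return oai.chat.completions.create(
        model=VERIFIER_MODEL,
        messages=[{"role": "user", "content": audit_prompt}],
    ).choices[0].message.content
```

The important part is that the verifier is explicitly told to hunt for fabricated sources, since that’s Grok’s main failure mode in my experience.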
From my subjective head-to-head output comparisons, Grok 4 is extremely close to GPT-5 Pro in certain niches for “best quality response.”
I would like xAI and Grok to win and increase their usage… but also don’t want it to become intentionally adversarial (unless it’s correcting illogical stances and/or research that contradicts observed reality).
Sadly X has devolved into: (A) Elon Musk sycophants (I’d rather be in this camp than the latter) and (B) Elon Musk haters… so feedback on Grok is often inaccurate and dishonest: (A) “Grok is terrible! Elon made Grok right-wing and biased!” vs. (B) “Grok is so much better than WokeGPT! ‘Scam Altman’ can’t be trusted!”
Neither of these takes is accurate.
Subjectively I’d say ChatGPT’s Pro version is STILL BETTER at the highest end than Grok. Sam Altman can be trusted… but the playbook for OpenAI is sterility and avoiding social controversy because this is required to maintain mainstream appeal, avoid regulatory pushback (especially from woke fuckers in the EU and left-wing safetyist retards in USA), and to maintain investment/funding.
I believe Elon can be fully trusted to TRY TO MAKE Grok the most truth-seeking AI in existence… whether Grok will legitimately end up as the most truthful AI remains unclear.
Currently it is the only AI that will seriously engage with 99% of topics and that doesn’t devalue the lives of Whites, Christians, Males, etc. while putting Africans, Muslims, and Transgenders on a pedestal… and it won’t automatically defer to woke consensus when it contradicts reality and logic.