Imnimo 18 hours ago

I think there are two different things going on here:

"DeepSeek trained on our outputs and that's not fair because those outputs are ours, and you shouldn't take other peoples' data!" This is obviously extremely silly, because that's exactly how OpenAI got all of its training data in the first place - by scraping other peoples' data off the internet.

"DeepSeek trained on our outputs, and so their claims of replicating o1-level performance from scratch are not really true" This is at least plausibly a valid claim. The DeepSeek R1 paper shows that distillation is really powerful (e.g. they show Llama models get a huge boost by finetuning on R1 outputs), and if it were the case that DeepSeek were using a bunch of o1 outputs to train their model, that would legitimately cast doubt on the narrative of training efficiency. But that's a separate question from whether it's somehow unethical to use OpenAI's data the same way OpenAI uses everyone else's data.

  • PeterStuer 2 hours ago

    Ironically, DeepSeek is doing what OpenAI originally pledged to do. Making the model open and free is a gift to humanity.

    Look at the whole AI revolution that Meta and others have bootstrapped by opening their models. Meanwhile OpenAI/Microsoft, Anthropic, Google and the rest are just looking after number one while angling for regulatory capture and an "AI for me but not for thee" outcome of full control.

    • jajko an hour ago

      I don't think it makes sense to look at previous PR statements from Altman et al on this when there are tens of billions floating around and egos get inflated to moon sizes. Farts in the wind have more weight, but this goes for all corporate PR.

      It's a case of thieves yelling 'stop those thieves' to me: they were just first and would not like losing that position. But it's all about money and consequently power, business as usual.

  • riantogo 16 hours ago

    Why would it cast any doubt? If you can use o1 output to build a better R1, then use R1 output to build a better X1... then a better X2... XN, that just shows a method to create better systems for a fraction of the cost from where we stand. If it was that obvious, OpenAI should have done it themselves. But the disruptors did it. In hindsight it might sound obvious, but that is true for all innovations. It is all good stuff.

    • Imnimo 16 hours ago

      I think it would cast doubt on the narrative "you could have trained o1 with much less compute, and r1 is proof of that", if it turned out that in order to train r1 in the first place, you had to have access to a bunch of outputs from o1. In other words, you had to do the really expensive o1 training in the first place.

      (with the caveat that all we have right now are accusations that DeepSeek made use of OpenAI data - it might just as well turn out that DeepSeek really did work independently, and you really could have gotten o1-like performance with much less compute)

      • deepGem 15 hours ago

        From the R1 paper

        > In this study, we demonstrate that reasoning capabilities can be significantly improved through large-scale reinforcement learning (RL), even without using supervised fine-tuning (SFT) as a cold start. Furthermore, performance can be further enhanced with the inclusion of a small amount of cold-start data

        Is this cold-start data what OpenAI is claiming is their output? If so, what's the big deal?

        • Imnimo 14 hours ago

          DeepSeek claims that the cold-start data is from DeepSeekV3, which is the model that has the $5.5M price tag. If that data were actually the output of o1 (a model that had a much higher training cost, and its own RL post-training), that would significantly change the narrative of R1's development, and what's possible to build from scratch on a comparable training budget.

          • Loic 5 hours ago

            Not for me. If I build a chemical factory, I do not reinvent everything.

            They are using the current SOTA tools and models to build new models for cheaper.

            • vlovich123 4 hours ago

              If R1 were better than o1, yes, you would be right. But the reporting I’ve seen is that it’s almost as good. Being able to copy cutting-edge models won’t advance the state of the art in terms of intelligence. They have made improvements in other areas, but if they reused o1 to train their model, that would effectively be a ctrl-c / ctrl-v strictly in terms of task performance.

              • unclebucknasty a few seconds ago

                It's not just about whether competitors can improve on OpenAI's models. It's about whether they can create reasonable substitutes for orders of magnitude less investment.

              • PeterStuer 3 hours ago

                Strong disagree. Copy/paste would mean they took o1's weights and started finetuning from there. That is not what happened here at all.

                • skinner_ 2 hours ago

                  When you build a new model, there is a spectrum of how you use the old model: 1. taking the weights, 2. training on the logits, 3. training on model output, 4. training from scratch. We don't know how much advantage #3 gives. It might be the case that with enough output from the old model, it is almost as useful as taking the weights.
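
                  A rough sketch of the gap between #2 and #3 in PyTorch terms; #2 presupposes white-box access and a shared tokenizer, which is exactly what a black-box API denies you:

                      import torch.nn.functional as F

                      # Option 2: train on the logits, matching the teacher's full
                      # next-token distribution (requires the teacher's internals).
                      def logit_loss(student_logits, teacher_logits, T=2.0):
                          return F.kl_div(
                              F.log_softmax(student_logits / T, dim=-1),
                              F.softmax(teacher_logits / T, dim=-1),
                              reduction="batchmean",
                          ) * (T * T)

                      # Option 3: train on model output; only the sampled token ids
                      # are observed, so plain API access is enough.
                      def output_loss(student_logits, teacher_token_ids):
                          return F.cross_entropy(
                              student_logits.reshape(-1, student_logits.size(-1)),
                              teacher_token_ids.reshape(-1),
                          )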

          • TheGeminon 14 hours ago

            In the paper DeepSeek just says they have ~800k responses that they used for the cold start data on R1, and are very vague about how they got them:

            > To collect such data, we have explored several approaches: using few-shot prompting with a long CoT as an example, directly prompting models to generate detailed answers with reflection and verification, gathering DeepSeek-R1-Zero outputs in a readable format, and refining the results through post-processing by human annotators.

            • Imnimo 13 hours ago

              My surface-level reading of these two sections is that the 800k samples come from R1-Zero (i.e. "the above RL training") and V3:

              >We curate reasoning prompts and generate reasoning trajectories by performing rejection sampling from the checkpoint from the above RL training. In the previous stage, we only included data that could be evaluated using rule-based rewards. However, in this stage, we expand the dataset by incorporating additional data, some of which use a generative reward model by feeding the ground-truth and model predictions into DeepSeek-V3 for judgment.

              >For non-reasoning data, such as writing, factual QA, self-cognition, and translation, we adopt the DeepSeek-V3 pipeline and reuse portions of the SFT dataset of DeepSeek-V3. For certain non-reasoning tasks, we call DeepSeek-V3 to generate a potential chain-of-thought before answering the question by prompting.

              The non-reasoning portion of the DeepSeek-V3 dataset is described as:

              >For non-reasoning data, such as creative writing, role-play, and simple question answering, we utilize DeepSeek-V2.5 to generate responses and enlist human annotators to verify the accuracy and correctness of the data.

              I think if we were to take them at their word on all this, it would imply there is no specific OpenAI data in their pipeline (other than perhaps their pretraining corpus containing some incidental ChatGPT outputs that are posted on the web). I guess it's unclear where they got the "reasoning prompts" and corresponding answers, so you could sneak in some OpenAI data there?
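
              (A toy sketch of the rejection-sampling step quoted above; `generate` and the rule-based `reward` are assumed interfaces, not the paper's actual code:)

                  def collect_reasoning_sft(model, prompts_with_answers, reward, k=16):
                      data = []
                      for prompt, truth in prompts_with_answers:
                          candidates = [model.generate(prompt) for _ in range(k)]
                          # Keep only trajectories the rule-based reward accepts,
                          # e.g. the final answer matches the ground truth.
                          data += [(prompt, c) for c in candidates if reward(c, truth) > 0]
                      return data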

              • deepGem 11 hours ago

                That's what I am gathering as well. Where is OpenAI going to get substantial proof to claim that their outputs were used?

                The reasoning prompts and answers for SFT from V3, you mean? No idea. For that matter, you have no idea where OpenAI got this data from either. If they open this can of worms, their own can of worms will be opened as well.

                • IAmGraydon 5 hours ago

                  >Where is OpenAI going to have substantial proof to claim that their outputs were used ?

                  I assume in their API logs.

                • rekttrader 4 hours ago

                  Shibboleths in output data
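
                  i.e., scan the suspect corpus for phrasings characteristic of one model's outputs. A toy sketch (the signature list is made up for illustration):

                      SIGNATURES = [
                          "as an ai language model",
                          "i cannot assist with that",
                          "my knowledge cutoff",
                      ]

                      def flagged(corpus):
                          # Return samples containing a suspected model-specific phrasing.
                          return [text for text in corpus
                                  if any(sig in text.lower() for sig in SIGNATURES)]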

          • joe_the_user 13 hours ago

            It's like the claim "they showed anyone can create a powerful model from scratch" becomes "false yet true".

            Maybe they needed OpenAI for their process. But now that their model is open source, anyone can use it as their cold start and spend the same (much smaller) amount.

            "From scratch" is a moving target. No one who makes their model with massive data from the net is really doing anything from scratch.

            • bmicraft 13 hours ago

              Yeah, but that kills the implied hope of building a better model for cheaper. Like this, you'll always have a ceiling of being a bit worse than the OpenAI models.

              • roenxi 5 hours ago

                The logic doesn't exactly hold; it is like saying that a student is limited by their teachers. It is certainly possible that a bad teacher will hold the student back, but ultimately a student can lag or improve on the teacher with only a little extra stimulus.

                They probably would need some other source of truth than an existing model, but it isn't clear how much additional data is needed.

      • manquer 11 hours ago

        > you had to do the really expensive o1 training in the first place

        It is no better for OpenAI in this scenario either: any competitor can easily copy their expensive training without spending the same, i.e. there is a second-mover advantage and no economic incentive to be the first one.

        To put it another way, the $500 billion Stargate investment will be worth just $5 billion once the models become available for consumption, because that is all it will take to replicate the same outcomes with new techniques, even if the cold start needed o1 output for RL.

        • hattmall 10 hours ago

          Shouldn't OpenAI be able to rather easily detect such usage?

      • MrLeap 15 hours ago

        o1 wouldn't exist without the combined compute of every mind that led to the training data they used in the first place. How many h100 equivalents are the rolling continuum of all of human history?

        • dchichkov 15 hours ago

          It should be possible to learn to reason from scratch. And the ability to reason in a long context seems to be very general.

          • Nevermark 14 hours ago

            How does one learn reasoning from scratch?

            Human reasoning, as it exists today, is the result of tens of thousands of years of intuition slowly distilled down to efficient abstract concepts like "numbers", "zero", "angles", "cause", "effect", "energy", "true", "false", ...

            I don't know what reasoning from scratch would look like without training on examples from other reasoning beings, as human children do.

            • dchichkov 13 hours ago

              There are examples of learning reasoning from scratch with reinforcement learning.

              Emergent tool use from multi-agent interaction is a good example - https://openai.com/index/emergent-tool-use/

              • ipaddr 8 hours ago

                Now you are asking for a perfect modeling of the system. Reinforcement learning works by discovering boundaries.

              • tracker1 2 hours ago

                Now rediscover all the plants that are and aren't poisonous to most people.

            • Davidzheng 14 hours ago

              Actually, I also think it's possible. Start with the natural numbers axiom system. Form all valid sentences of increasing length. RL on a model to search for counterexamples or proofs. This, on a sufficient computer, should produce superhuman math performance (efficiency), even at compute parity.
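
              A toy sketch of the enumerate-and-falsify half of that idea (the proof search is the hard part and is omitted; the tiny term language is made up):

                  from itertools import product

                  TERMS = ["a", "b", "a + b", "b + a", "a * b", "a * (b + 1)"]

                  def survives(conjecture, bound=20):
                      # Reject any candidate identity that has a small counterexample.
                      return all(eval(conjecture, {"a": a, "b": b})
                                 for a in range(bound) for b in range(bound))

                  candidates = (f"({l}) == ({r})" for l, r in product(TERMS, repeat=2) if l != r)
                  print([c for c in candidates if survives(c)])
                  # Survivors like '(a + b) == (b + a)' become targets for proof search.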

              • MrLeap 13 hours ago

                I wonder how much discovery in math happens as a result of lateral-thinking epiphanies, i.e.: a mathematician is trying to solve a problem, their mind is open to inspiration, and something in nature, or their childhood, or a book synthesizes with their mental model and gives them the next node in their mental graph that leads to a solution and advancement.

                In an axiomatic system, those solutions are checkable, but how discoverable are they when your search space starts from infinity? How much do you lose by disregarding the gritty reality and foam of human experience? It provides inspirational texture that helps mathematicians in the search at least.

                Reality is a massive corpus of cause and effect that can be modeled mathematically. I think you're throwing the baby out with the bathwater if you expect to be able to do math in a vacuum. Maybe there is a self-optimization spider that can crawl up the axioms and solve all of math. I think you'll find that you can generate new math infinitely, and reality grounds it and provides the gravity to direct efforts towards things that are useful, meaningful and interesting to us.

                • soulofmischief 13 hours ago

                  As I mentioned in a sister comment, Gödel's incompleteness theorems also throw a wrench into things, because you will be able to construct logically consistent "truths" that may not actually exist in reality. At which point, your model of reality becomes decreasingly useful.

                  At the end of the day, all theory must be empirically verified, and contextually useful reasoning simply cannot develop in a vacuum.

                  • staunton 4 hours ago

                    Those theorems are only relevant if "reasoning" is taken to its logical extreme (no pun intended). If reasoning is developed/trained/evolved purely in order to be useful and not pushed beyond practical applications, the question of "what might happen with arbitrarily long proofs" doesn't even come up.

                    On the contrary, when reasoning about the real world, one must reason starting from assumptions that are uncertain (at best) or even "clearly wrong but still probably useful for this particular question" (at worst). Any long and logic-heavy proof would make the results highly dubious.

                • danenania 5 hours ago

                  A question is: what algorithms does the brain use to make these creative lateral leaps? Are they replicable?

                  Unless the brain is using physics that we don’t understand or can’t replicate, it seems that, at least theoretically, there should be a way to model what it’s doing with silicon and code.

                  States like inspiration and creativity seem to correlate in an interesting way with ‘temperature’, ‘top p’, and other LLM inputs. By turning up the randomness and accepting a wider range of output, you get more nonsense, but you also potentially get more novel insights and connections. Human creativity seems to work in a somewhat similar way.
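
                  (For reference, a rough sketch of those two knobs as they're conventionally implemented: temperature rescales the logits, and top-p keeps the smallest set of tokens whose probabilities sum past p:)

                      import torch

                      def sample(logits, temperature=1.0, top_p=1.0):
                          # logits: 1-D tensor over the vocabulary.
                          probs = torch.softmax(logits / temperature, dim=-1)
                          sorted_probs, sorted_ids = torch.sort(probs, descending=True)
                          cumulative = torch.cumsum(sorted_probs, dim=-1)
                          # Keep the smallest prefix of tokens whose mass reaches top_p.
                          keep = cumulative - sorted_probs < top_p
                          sorted_probs[~keep] = 0.0
                          sorted_probs /= sorted_probs.sum()
                          return sorted_ids[torch.multinomial(sorted_probs, 1)].item()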

            • soerxpso 11 hours ago

              There was necessarily a "first reasoning being" who learned reasoning from scratch, and then it improved from there. Humans needed tens of thousands of years because:

              - humans experience reality at a slower pace than AI could theoretically experience a simulated reality

              - humans have to transfer knowledge to the next generation every 80 years (in a manner that's very lossy), and around half of each human lifespan is spent learning things that the previous generation already knew

              • addicted 10 hours ago

                The idea that there was “necessarily a first reasoning being” is neither obvious nor likely.

                Reasoning could very well have originally been an emergent property of a group of beings.

                The animal kingdom is full of examples of groups being more intelligent than individuals, including in human animals as of today.

                It’s entirely possible that reasoning emerged as a property of a group before it emerged in any individual first.

                • carlob 4 hours ago

                  I think you are focusing too much on the fact that a being needs to be an individual organism, which is kind of an implementation detail.

                  What I wonder instead is whether reasoning is a property that is either there or not there, with a sharp boundary of existence.

          • MrLeap 14 hours ago

            Creating reasoning from scratch is the same task as creating an apple pie from scratch.

            First you must invent the universe.

          • PeterStuer 2 hours ago

            Possible? I guess evolution did it over the course of a few billion years. For engineering purposes, starting from the best advanced position seems far more efficient.

          • miki123211 13 hours ago

            It is possible to learn to reason from scratch; that's what R1-Zero did, but the resulting chains of thought aren't legible to humans.

            To quote DeepSeek directly:

            > DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL.
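
            For flavor, the reward that drives this kind of RL can be almost embarrassingly simple. A hedged sketch of a rule-based accuracy-plus-format reward in the spirit of the paper (the exact rules and weights here are guesses):

                import re

                def reward(completion, ground_truth):
                    # Format rule: the answer must appear inside the expected tags.
                    m = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
                    if m is None:
                        return 0.0
                    # Accuracy rule: verifiable tasks (math, code) have checkable answers.
                    return 1.0 if m.group(1).strip() == ground_truth.strip() else 0.1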

            • dchichkov 13 hours ago

              If you look at the benchmarks of the DeepSeek-V3-Base, it is quite capable, even in 0-shot: https://huggingface.co/deepseek-ai/DeepSeek-V3-Base#base-mod... This is not from scratch. These benchmark numbers are an indication that the base model already had a large number of reasoning/LLM tokens in the pre-training set.

              On the other hand, my take on it, the ability to do reasoning in a long context is a general capability. And my guess is that it can be bootstrapped from scratch, without having to do training on all of the internet or having to distill models trained on the internet.

              • cma 10 hours ago

                > These benchmark numbers are an indication that the base model already had a large number of reasoning/LLM tokens in the pre-training set.

                But we already know that is the case: the Deepseek v3 paper says it was posttrained partly with an internal version of R1:

                > Reasoning Data. For reasoning-related datasets, including those focused on mathematics, code competition problems, and logic puzzles, we generate the data by leveraging an internal DeepSeek-R1 model. Specifically, while the R1-generated data demonstrates strong accuracy, it suffers from issues such as overthinking, poor formatting, and excessive length. Our objective is to balance the high accuracy of R1-generated reasoning data and the clarity and conciseness of regularly formatted reasoning data.

                And DeepSeekMath did a repeated cycle of this kind of thing, mixing in 10% of old, previously seen data with new data generated by the last generation, in a continuous bootstrap.

          • soulofmischief 13 hours ago

            I've been giving this a lot of thought over the last few months. My personal insight is that "reasoning" is simply the application of a probabilistic reasoning manifold on an input in order to transform it into constrained output that serves the stability or evolution of a system.

            This manifold is constructed via learning a decontextualized pattern space on a given set of inputs. Given the inherent probabilistic nature of sampling, true reasoning is expressed in terms of probabilities, not axioms. It may be possible to discover axioms by locating fixed points or attractors on the manifold, but ultimately you're looking at a probabilistic manifold constructed from your input set.

            But I don't think you can untie this "reasoning" from your input data. It's possible you will find "meta-reasoning", or similar structures found in any sufficiently advanced reasoning manifold, but these highly decontextualized structures might be entirely useless without proper recontextualization, necessitating that a reasoning manifold is trained on input whose patterns follow learnable underlying rules, if the manifold is to be useful for processing input of that kind.

            Decontextualization is learning, decomposing aspects of an input into context-agnostic relationships. But recontextualization is the other half of that, knowing how to take highly abstract, sometimes inexpressible, context-agnostic relationships and transform them into useful analysis in novel domains.

            This doesn't mean a well-trained model can't reason about input it hasn't encountered before, just that the input needs to be in some way causally connected to the same laws which governed the input the manifold was trained on.

            I'm sure we could create a fully generalized reasoning manifold which could handle anything, but I don't see how we possibly get that without first considering and encountering all possible inputs. But these inputs still have to have some form of constraint governed by laws that must be learned through sampling, otherwise you'd just be training on effectively random data.

            The other commenter who suggested simply generating all possible sentences and training on internal consistency should probably consider Gödel's incompleteness theorems, and that internal consistency isn't enough to accurately model and interpret the universe. One could construct a thought experiment about an isolated brain in a jar with effectively unlimited neuronal connections, but no sensory connection to the outside world. It's possible, with enough connections, that the likelihood of the brain conceiving of true events it hasn't actually encountered does increase meaningfully. But the brain still has nothing to validate against, and can't simply assume that because something is internally logically consistent, that it must exist or have existed.

      • hmottestad 14 hours ago

        At the pace that DeepSeek is developing, we should expect them to surpass OpenAI before long.

        The big question really is: are we doing it wrong? Could we have created o1 for a fraction of the price? Will o4 cost less to train than o1 did?

        The second question is naturally. If we create a smarter LLM, can we use it to create another LLM that is even smarter?

        It would have been fantastic if DeepSeek could have come out with an o3 competitor before o3 even became publicly available. That way we would have known for sure that we’re doing it wrong, because then either we could have used o1 to train a better AI or we could have just trained in a smarter and cheaper way.

        • pertymcpert 14 hours ago

          The whole discussion is about whether or not the second case of using o1 outputs to fine tune R1 is what allowed R1 to become so good. If that's the case then your assertion that DeepSeek will surpass OpenAI doesn't really make sense because they're dependent on a frontier model in order to match, not surpass.

          • hmottestad 3 hours ago

            Yeah, that's my point. If they do end up surpassing OpenAI then it would seem likely that they aren't just relying on copying from o1, or whatever model is the frontier model at that time.

      • SpaceManNabs 15 hours ago

        My question is: if DeepSeek R1 is just a distilled o1, I wonder if you can build a fine-tuned R1 through distillation without having to fine-tune o1.

      • zombiwoof 14 hours ago

        Exactly. They piggybacked off lots of compute and used less. There is still a total sum of a massive amount of compute.

        • cratermoon 12 hours ago

          OpenAI piggybacked on the whole internet and the catalogued and shared human knowledge therein.

          • PeterStuer 2 hours ago

            And let's not forget a gazillion hours of human reinforcement by armies of third-world mechanical Turks.

          • fmbb 5 hours ago

            That’s a lot of watt hours!

        • da_chicken 14 hours ago

          I mean, yes that's how progress works. Has OpenAI got a patent? If not it's fair game.

          We don't make people figure out how to domesticate a cow every time they want a hamburger. Or test hundreds of thousands of filaments before they can have a lightbulb. Inventions, once invented, exist as giants to stand upon. The inventor can either choose to disclose the invention and earn a patent for exclusive rights, or they can try to keep it a secret and hope nobody reverse engineers it.

      • vkou 14 hours ago

        If OpenAI had to account for the cost of producing all the copyrighted material they trained their LLM on, their system would be worth negative trillions of dollars.

        Let's just assume that the cost of training can be externalized to other people for free.

      • cherry_tree 15 hours ago

        > I think it would cast doubt on the narrative "you could have trained o1 with much less compute, and r1 is proof of that"

        Whether or not you could have, you can now.

    • rockemsockem 16 hours ago

      I think the prevailing narrative ATM is that DeepSeek's own innovation was done in isolation and they surpassed OpenAI, even though in the paper they give a lot of credit to Llama for their techniques. The idea that they used o1's outputs for their distillation further shows that models like o1 are necessary.

      All of this should have been clear anyway from the start, but that's the Internet for you.

      • joe_the_user 16 hours ago

        > The idea that they used o1's outputs for their distillation further shows that models like o1 are necessary.

        Hmm, I think the narrative of the rise of LLMs is that once the output of humans has been distilled by the model, the human isn't necessary.

        As far as I know, DeepSeek adds only a little to the transformers model while o1/o3 added a special "reasoning component" - if DeepSeek is as good as o1/o3, even taking data from it, then it seems the reasoning component isn't needed.

        • david-gpu 15 hours ago

          > I think the narrative of the rise of LLMs is that once the output of humans has been distilled by the model

          Distillation is a term of art in AI and it is fundamentally incorrect to talk about distilling human-created data. Only an AI model can be distilled.

          https://en.m.wikipedia.org/wiki/Knowledge_distillation#Metho...

          • joe_the_user 13 hours ago

            Meh,

            It seems clear that the term can be used informally to denote the boiling down of human knowledge; indeed, it was used that way before AI appeared in the popular imagination.

            • david-gpu 13 hours ago

              In the context in which you said it, it matters a lot.

              >> The idea that they used o1's outputs for their distillation further shows that models like o1 are necessary.

              > Hmm, I think the narrative of the rise of LLMs is that once the output of humans has been distilled by the model, the human isn't necessary.

              If deepseek was produced through the distillation (term of art) of o1, then the cost of producing deepseek is strictly higher than the cost of producing o1, and can't be avoided.

              Continuing this argument, if the premise is true then deepseek can't be significantly improved without first producing a very expensive hypothetical o1-next model from which to distill better knowledge.

              That is the argument that is being made. Please avoid shallow dismissals.

              Edit: just to be clear, I doubt that deepseek was produced via distillation (term of art) of o1, since that would require access to o1's weights. It may have used some of o1's outputs to fine tune the model, which still would mean that the cost of training deepseek is strictly higher than training o1.

              • joe_the_user 13 hours ago

                > just to be clear, I doubt that deepseek was produced via distillation

                Yeah, your technical point is kind of ridiculous here: in all my uses of distillation (and in the comment I quoted), distillation is used in the informal sense, and there's no allegation that DeepSeek could have been in possession of OpenAI's model weights, which is what's needed for your "distillation (term of art)".

                • ada1981 5 hours ago

                  I’m not sure why folks don’t speculate China is able to obtain copies of OpenAI's weights.

                  Seems reasonable they would be investing heavily in placing state assets within OpenAI so they can copy the models.

        • PontifexCipher 15 hours ago

          Some info that may be missing:

          - v2/v3 (not r1) seem to be cloned from o1/4o output, and perform worse (this cost the oft-repeated 5ish mm USD)

          - r1 is specifically a reasoning step (using RL) _on top of_ v2/v3 and performs similarly to o1 (the cost of this is _not reported anywhere_)

          - In the o1 blog post, they specifically say they use RL to add reasoning to LLMs: https://openai.com/index/learning-to-reason-with-llms/

          • sudosysgen 15 hours ago

            The R1-Zero paper shows how many training steps the RL took, and it's not many. The cost of the RL is likely a small fraction of the cost of the foundational model.

      • aprilthird2021 16 hours ago

        > the prevailing narrative ATM is that DeepSeek's own innovation was done in isolation and they surpassed OpenAI

        I did not think this, nor did I think this was what others assumed. The narrative, I thought, was that there is little point in paying OpenAI for LLM usage when a much cheaper, similar / better version can be made and used for a fraction of the cost (whether it's on the back of existing LLM research doesn't factor in)

        • TheGRS 15 hours ago

          Yes, well the narrative that rocked the stock market is different. It's looking at what DeepSeek did and assuming they may have a competitive advantage in this space and could outperform OpenAI at their own game.

          If the narrative is actually that DeepSeek can only reach whatever heights OpenAI has already gotten to with some new tricks, then markets will probably refocus on OpenAI's innovations and price things accordingly, even if the initial cost is huge. It also means OpenAI probably needs a better moat to protect its interests.

          I'm not sure where the reality is exactly, but market reactions so far have basically followed that initial narrative and now the rebuttal.

          • addicted 10 hours ago

            The idea that someone can easily replicate an OpenAI model based simply on OpenAI outputs is, I’d argue, immeasurably worse for OpenAI’s valuation than the idea that someone happened to come up with a few innovations that leapfrogged OpenAI.

            The latter could be a one-time thing, and/or OpenAI could still use their financial might to leverage those innovations and get even better with them.

            However, the former destroys their business model and no amount of intelligence and innovation from OpenAI protects them from being copied at a fraction of the cost.

          • aprilthird2021 6 hours ago

            > Yes, well the narrative that rocked the stock market is different.

            How do you know this?

            > If the narrative is actually that DeepSeek can only reach whatever heights OpenAI has already gotten to with some new tricks, then markets will probably refocus on OpenAI's innovations and price things accordingly

            Why? If every innovation OpenAI is trying to keep as secret sauce becomes commoditized quickly and cheaply, then why would markets care about any innovations they have? They will be unable to monetize them.

        • kelnos 15 hours ago

          > I did not think this, nor did I think this was what others assumed.

          That's what I thought and assumed. This is the narrative that's been running through all the major news outlets.

          It didn't even occur to me that DeepSeek could have been training their models using the output of other models until reading this article.

          • bigfudge 4 hours ago

            Fwiw I assumed they were using o1 to train. But it doesn’t matter: the big story here is that massive compute resources are unlikely to be as important in the future as we thought. It cuts the legs off Stargate etc. just as it’s announced. The CCP must be highly entertained by the timeline.

        • aiono 16 hours ago

          That's only the case if you don't need to use the output of a much more expensive model.

    • KingOfCoders 16 hours ago

      OpenAI couldn't do it. When the high cost of training and access to GPUs is their competitive advantage against startups, they can't admit that it doesn't exist.

    • herodoturtle 5 hours ago

      Thanks for the insightful comment.

      I have a question (disclaimer: reinforcement learning noob here):

      Is there a risk of broken telephone with this?

      Kinda like repeatedly compressing an already compressed image eventually leads to a fuzzy blur.

      If that is the case then I’m curious how this is monitored and / or mitigated.

    • ospray 13 hours ago

      They did do that themselves; it's called o3.

    • dontreact 14 hours ago

      Is there any evidence R1 is better than o1?

      It seems like, if they did in fact distill, then what we have found is that you can create a worse copy of the model for ~$5M in compute by training on its outputs.

    • iforgot22 15 hours ago

      "Then use R1 output to build a better X1" is the part I'm not sure about. Is X1 going to actually be better than R1?

    • dartos 13 hours ago

      What does “better” really even mean here?

      Better benchmark scores can be cooked

    • qwertox 14 hours ago

      They're standing on the shoulders of giants, not only in terms of re-using expensive computing power almost for free by using the outputs of expensive models. It's a bit of a tradition in that country, also in manufacturing.

      • unreal37 12 hours ago

        I thought OpenAI GPT took Wikipedia and the content of every book as inputs to train their models?

        Everyone is standing on the shoulders of giants.

        • qwertox 7 hours ago

          What I meant to say was that OpenAI did put a lot of money into extracting value out of the pile of (partially copyrighted) data, and that DeepSeek was freeloading on that investment without disclosing it, making them look more efficient than they truly are.

      • bigfudge 4 hours ago

        How do you think manufacturing in the US got started? Everyone is on someone’s shoulders.

    • anothernewdude 5 hours ago

      If they're training R1 on o1 output on the benchmarks, then I don't trust those benchmark results for R1. It means the model is liable to be brittle, and they need to prove otherwise.

    • Sophira 14 hours ago

      Honestly, it's kind of silly that this technology is in the hands of companies whose only aim is to make money, IMO.

      • lenerdenator 14 hours ago

        Well, originally, OpenAI wasn't supposed to be that kind of organization.

        But if you leave someone in the tech industry of SV/SF long enough, they'll start to get high on their own supply and think they're entitled to insane amounts of value, so...

      • goatlover 14 hours ago

        It's because they're the ones who could raise the money to make those models. Academics don't have access to that kind of compute. But the free models exist.

    • gmd63 16 hours ago

      Why not just copy and paste the model and change the name? That's an even more efficient form of distillation.

      • wgjordan 15 hours ago

        Even assuming the model was somehow publicly available in a form that could be directly copied, that would be a more blatant form of copyright infringement. Distillation launders copyrighted material in a way that OpenAI specifically has argued falls under fair use.

  • miki123211 13 hours ago

    > This is obviously extremely silly, because that's exactly how OpenAI got all of its training data

    IANAL, but it is worth noting here that DeepSeek has explicitly consented to a license that doesn't allow them to do this. That is a condition of using ChatGPT and the OpenAI API.

    Even if the courts affirm that there's a fair use defence for AI training, DeepSeek may still be in the wrong here, not because of copyright infringement, but because of a breach of contract.

    I don't think OpenAI would have much of a problem if you train your model on data scraped from the internet, some of which incidentally ends up being generated by ChatGPT.

    Compare this to training AI models on Kindle Books randomly scraped off the internet, versus making a Kindle account, agreeing to the Kindle ToS, buying some books, breaking Amazon's DRM and then training your AI on that. What DeepSeek did is more analogous to the latter than the former.

    • anon373839 13 hours ago

      > DeepSeek has explicitly consented to a license that doesn't allow them to do this.

      You actually don’t know this. Even if it were true that they used OpenAI outputs (and I’m very doubtful) it’s not necessary to sign an agreement with OpenAI to get API outputs. You simply acquire them from an intermediary, so that you have no contractual relationship with OpenAI to begin with.

    • krust 13 hours ago

      >IANAL, but It is worth noting here that DeepSeek has explicitly consented to a license that doesn't allow them to do this. That is a condition of using the Chat GPT and the OpenAI API.

      I have some news for you

    • blibble 12 hours ago

      training is either fair use, or it isn't

      OpenAI can't have it both ways

      • chefandy 12 hours ago

        Right, but it was never about doing the right thing for humanity, it was about doing the right thing for their profits.

        Like I’ve said time and time again, nobody in this space gives a fuck about anyone that isn’t directly contributing money to their bottom line at that particular instant. The fundamental idea is selfish, damages the fundamental machinery that makes the internet useful by penalizing people that actually make things, and will never, ever do anything for the greater good if it even stands a chance of reducing their standing in this ridiculously overhyped market. Giving people free access to what is for all intents and purposes a black box is not “open” anything, is no more free (as in speech) than Slack is, and all of this is obviously them selling a product at a huge loss to put competing media out of business and grab market share.

      • miki123211 7 hours ago

        The issue here is breach of contract, not copyright.

        • glooglork 4 hours ago

          It's quite unlikely that OpenAI didn't break any ToS with all the data they used for training their models. And not just OpenAI, but all companies that are developing LLMs.

          IMO, it would look bad for OpenAI to push strongly with this story; it would look like they're losing the technological edge and are now looking for other ways to make sure they remain on top.

        • staticman2 4 hours ago

          Similar to how a patent contract becomes void when a patent expires regardless of what the terms of the contract say, it's not clear to me OpenAI can enforce a contract provision for an API output they own no copyright in.

          Since they have no intellectual property rights in the output, it's not clear to me they have a cause of action to sue over how the output is used.

          I wonder if any lawyers have written about this topic.

        • prmoustache 4 hours ago

          What makes you think they had a contract with them in the first place? You can use openAI through intermediaries/proxies.

          • WolfRazu 2 hours ago

            I assume all those intermediaries have to pass on the same ToS to their customers otherwise that seems like a very unusual move.

      • avs733 9 hours ago

        They can sure try, though, and I would be damned surprised if this wasn't related to Sam's event with Trump last week.

      • windexh8er 12 hours ago

        "Free for me, not for thee!" - Sam Altman /s

        But in all reality I'm happy to see this day. The fact that OpenAI ripped off everyone and everything they could and, to this day, pretends like they didn't, is fantastic.

        Sam Altman is a con, and given all the positive press DeepSeek got, it's not surprising that there was a full-court assault on them within 48 hours.

    • dartos 13 hours ago

      TOS are not contracts.

      • Spooky23 8 hours ago

        People here will argue that. But the Chinese DNGAF.

      • lolinder 13 hours ago

        Citation? My understanding was that they are, provided that someone has to affirmatively accept them in order to use your site. So Terms of Service stuck at the bottom in the footer likely would not count as a contract because there's no consent, but Terms of Service included as a check box on a login form likely would count.

        But IANAL, so if you have a citation that says otherwise I'd be happy to see it!

        • addicted 10 hours ago

          You don’t need a citation.

          You just need to read OpenAI’s arguments about why TOS and copyright laws don’t apply to them when they’re training on other people’s copyrighted and TOS protected data and running roughshod over every legal protection.

        • xdennis 11 hours ago

          IANAGL, but in Germany a ToS is not a contract and can be declared void if it's deemed by courts to be unfair.

          • vanviegen 3 hours ago

            Yes, though this is especially true when it's consumers 'agreeing' to the TOS. Anything even somewhat surprising within such a TOS is basically thrown out the window in European courtrooms without a second look.

            For actual, legally binding consent, you'll need to make some real effort to make sure the consumer understands what they are agreeing to.

    • like_any_other 12 hours ago

      Legally, I understand your point, but morally, I find it repellent that a breach of contract (especially terms-of-service) could be considered more important than a breach of law. Especially since simply existing in modern society requires us to "agree" to dozens of such "contracts" daily.

      I hope voters and governments put a long-overdue stop to this cancer of contract-maximalism that has given us such benefits as mandatory arbitration, anti-benchmarking, general circumvention of consumer rights, or, in this case, blatantly anti-competitive terms, by effectively banning reverse-engineering (i.e. examining how something works, i.e. mandating that we live in ignorance).

      Because if they don't, laws will slowly become irrelevant, and our lives governed by one-sided contracts.

    • dmitrygr 13 hours ago

      > DeepSeek has explicitly consented to a license that doesn't allow them to do this.

      By existing in USA, OpenAI consented to comply with copyright law, and how did that go?

    • freen 13 hours ago

      Did OpenAI abide by my service’s terms of service when it ingested my data?

      • cortesoft 13 hours ago

        Did OpenAI have to sign up for your service to gain access?

        • lolinder 13 hours ago

          It probably ignored hundreds of thousands of "by using this site you consent to our Terms and Conditions" notices, many of which probably would be read as prohibiting training. But that's also a great example of why these implicit contracts don't really work as contracts.

          • otherme123 13 hours ago

            OpenAI scraped my blog so aggressively that I had to ban their IPs. They exceeded the limits in robots.txt (which is a kind of ToS) by two orders of magnitude, and they ignored the explicit ToS that I copy-pasted blindly from somewhere, which it turns out forbids what they did (something like: you can't make money with the content). Not that I'm going to enforce it, but they should at least shut up.

          • freen 13 hours ago

            Civil law is only available to deep pockets.

            Contracts are enforceable to the degree to which you can pay lawyers to enforce them.

            I will run out of money trying to enforce my terms of service against openAI, while they have a massive war chest to enforce theirs.

            Ain’t libertarianism great?

            • blibble 12 hours ago

              solution: live in a country OpenAI can't get to you

              e.g China

              • staunton 4 hours ago

                Are you suggesting it's easier to successfully sue OpenAI for copyright infringement if you live in China?

                • qup an hour ago

                  No, they're suggesting that DeepSeek avoids getting sued by OpenAI.

        • bayindirh 12 hours ago

          No, but some of the data is licensed.

          For example, my digital garden is under the GFDL, and my blog is CC BY-NC-SA. IOW, they can't remix my digital garden under any license other than the GFDL, and they have to credit me if they remix my blog and can't use it for any commercial endeavor, which OpenAI certainly does now.

          So, by scraping my webpages, they agree to my licensing of my data. So they're de-facto breaching my licenses, but they cry "fair-use".

          If I tell them that they're breaching the license terms, they'd laugh at me, and maybe give me 2 cents of API access to mock me further. When somebody allegedly uses their API against their unenforceable ToS, they scream like an agitated cockatoo (which is an insult to the cockatoo, BTW. They're devilishly intelligent birds).

          Drinking their own poison was mildly painful, I guess...

          BTW, I don't believe that Deepseek has copied/used OpenAI models' outputs or training data to train theirs, even if they did, "the cat is out of the bag", "they did something amazing so they needed no permissions", "they moved fast and broke things", and "all is fair-use because it's just research" regardless of how they did it.

          Heh.

          • Ukv an hour ago

            > So, by scraping my webpages, they agree to my licensing of my data.

            If the fair use defense holds up, they didn't need a license to scrape your webpage. A contract should still apply if you only showed your content to people who've agreed to it.

            > and "all is fair-use because it's just research"

            Fair use is a defense to copyright infringement, not breach of contract. You can use contracts, like NDAs, to protect even non-copyright-eligible information.

            Morally I'd prefer what DeepSeek allegedly did to be legal, but to my understanding there is a good chance that OpenAI is found legally in the right on both sides.

            • bayindirh 13 minutes ago

              At this point, what I'm afraid is the justice system will be just an instrument in this all Us vs. Them debate, so their decisions will not be bound by law or legality.

              Speculations aside, from what I understood, something like this shouldn't hold a drop of water under the fair-use doctrine, because there's disproportionate damage, plus a huge monopolistic monetary gain, because of what they did and how they did it.

              On the other hand, I don't believe that DeepSeek used OpenAI (in any capacity or way or method) to develop their models, but again, it doesn't matter how they did it in the current conjuncture.

              What they successfully did was to upset a bunch of high level people, regardless of the technical things they achieved.

              IMHO, AI war has similar dynamics to MAD. The best way is not to play, but we are past the Rubicon now. Future looks dirty.

        • addicted 10 hours ago

          They probably did to access the NYTimes articles.

        • baq 5 hours ago

          Have their scraping bots consented to cookies?

        • freen 13 hours ago

          Actually, yes, they actively agreed to them. Clicked the button and everything.

        • outside1234 13 hours ago

          That isn't required to be in violation of copyright

        • thorncorona 13 hours ago

          Can you steal someone else’s laptop if they stood up to get a drink?

          • cortesoft 10 hours ago

            Wouldn't this analogy be more like, "can you read my laptop screen if I stood up to get a drink?"

          • addicted 10 hours ago

            OpenAI itself has argued, to the degree that your analogy applies, that if the goal of stealing the laptop is to train AI then the answer is Yes.

          • gizajob 12 hours ago

            If their OS is open to the internet and you can scrape it and copy it off while they’re gone, then that would be about the right analogy. And OpenAI and DeepSeek have done the same thing in that case.

          • secstate 12 hours ago

            Yes, if you can pay off any witnesses.

    • anothernewdude 4 hours ago

      It's not hard to get someone else to submit queries and post the results, without agreeing to the license.

  • tempeler 16 hours ago

    On another subject: if it belongs to OpenAI because it uses OpenAI, then doesn't that mean that everything produced using OpenAI belongs to OpenAI? Isn't that a reason not to use OpenAI? It's very similar to saying that because you used Google to search, the resulting product now belongs to Google. They couldn't figure out how to respond; they went crazy.

    • dathinab 15 hours ago

      The US ruled that AI-produced things are, by themselves, not copyrightable.

      So no, it doesn't belong to OpenAI.

      You might be able to sue for penalties for breach of the ToS, but that doesn't give them the right to the model. It also doesn't give them any right to invalidate unbound copyright grants they have given to third parties (here, literally everyone), nor does it prevent anyone from training their own new models based on it or from using it. Oh, and the one breaching the ToS might not even have been the company behind DeepSeek but some in-between third party.

      Naturally this is under a few assumptions:

      - the US consistently applies its own law, but they have a long history of not doing so

      - the US doesn't abuse its power to force its economic opinions (ban DeepSeek) on other countries

      - it actually was trained on OpenAI outputs, but, uh, OpenAI has IMHO shown over the years very clearly that they can't be trusted and they are fully opaque. How do we trust their claim? How do we trust them not to have retrospectively tweaked their model to make it look as if DeepSeek copied it?

      • protocolture 12 hours ago

        >The US ruled that AI produced things are by themself not copyrightable.

        The US ruled that the AI cannot be the author; that doesn't imply, as so many clickbait articles suggest, that no AI products can be copyrighted.

        One activist tried to get the US Copyright Office to acknowledge his LLM as the author, which would then provide him a license to the work.

        There was no issue with him being the original author and copyright holder of the AI works. But that's not what was being challenged.

        • Aloisius 11 hours ago

          The copyright office ruled AI output is uncopyrightable without sufficient human contribution to expression.

          Prompts, they said, were unlikely to be enough to satisfy the requirement of a human controlling the expressive elements; thus most AI output today is probably not copyrightable.

          https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...

          • protocolture 9 hours ago

            >The Office concludes that, given current generally available technology, prompts alone do not provide sufficient human control to make users of an AI system the authors of the output.

            Prompts alone.

            But there are almost no cases of "Prompts Alone" products seeking copyright.

            Even, what, 3-4 years ago, AI tools moved onto a collaborative footing. Novel AI forces a collaborative process (and gives you output that can demonstrate your input, which is nice). ChatGPT effectively forces it due to limited memory.

            There was a case, posted here on ycombinator, where a Chinese judge held that "significant" human interaction was involved when a user made 20-odd adjustments to their prompt, iterating over produced images, and then added a watermark to the result. I would be very surprised if most sensible jurisdictions didn't follow suit.

            Midjourney and ChatGPT already include tools to mask and identify parts of the image to be regenerated. And multiple image generators allow dumb stuff like stick figures and so forth to stand in as part of an uploaded image prompt.

            And then theres AI voice which is another whole bag of tricks.

            >thus most AI output today is probably not copyrightable.

            Unless it was worked on even slightly, as above. In fact it would be hard to imagine much AI work that isn't copyrightable. Maybe those Facebook pages that just prompt "Cyberpunk Girl" and spit out endless variations. But I doubt copyright is at the forefront of their minds.

        • dathinab 12 hours ago

          But even then, wouldn't the people using OpenAI be the author/copyright holder, and never OpenAI (as no human on OpenAI's side is involved in the process of creating the works)?

          • protocolture 12 hours ago

            OpenAI is a company of humans; the product is ChatGPT. There's a grey area regarding who owns the content, so OpenAI's terms and conditions state that all ownership of the resulting content belongs to the user. This is actually advantageous because it means that they don't hold ownership of bad things created by their tool.

            • protocolture 11 hours ago

              That said, you can still impose terms to access the tool. IIRC Midjourney allows creators to own their content but also forces them to license it back to Midjourney for advertising. Prompts too, from memory.

    • johndhi 14 hours ago

      To be clear, their terms of service are pretty clear that the USER owns the outputs.

      • jonathanstrange 12 hours ago

        The official stance in the US is currently that there is no copyright on AI output.

        • protocolture 12 hours ago

          The US ruled that the AI cannot be the author; that doesn't imply, as so many clickbait articles suggest, that no AI products can be copyrighted.

          One activist tried to get the US Copyright Office to acknowledge his LLM as the author, which would then provide him a license to the work.

          There was no issue with him being the original author and copyright holder of the AI works. But that's not what was being challenged.

  • m348e912 6 hours ago

    > "DeepSeek trained on our outputs"

    I'm wondering how DeepSeek could have made hundreds of millions of training queries to OpenAI without one person at OpenAI catching on.

    • tisc 4 hours ago

      Maybe they use AI to monitor traffic, but it is still learning :)

  • valine 16 hours ago

    The existence of R1-Zero is evidence against any sort of theft of OpenAI's internal CoT data. The model sometimes outputs illegible text that's useful only to R1. You can't do distillation without a shared vocabulary. The only way R1 could exist is if they trained it with RL.

  • HarHarVeryFunny 14 hours ago

    DeepSeek-R1-Zero (based on the DeepSeek-V3 base model) was only trained with RL, no SFT, so this isn't at all like the "distillation" (i.e. SFT on synthetic data generated by R1) that they also demonstrated by fine-tuning Qwen and LLaMA.

    Now, DeepSeek may (or may not) have used some o1-generated data for the R1-Zero RL training, but if so, that's just a cost saving vs having to source reasoning data some other way, and it in no way reduces the legitimacy of what they accomplished (which is not something any of the AI CEOs are saying).

  • s17n 14 hours ago

    > This is obviously extremely silly, because that's exactly how OpenAI got all of its training data in the first place - by scraping other peoples' data off the internet.

    OpenAI has also invested heavily in human annotation and RLHF. If all DeepSeek wanted was a proxy for scraped training data, they'd probably just scrape it themselves. Using existing RLHF'd models as replacement for expensive humans in the training loop is the real game changer for anyone trying to replicate these results.
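
    A sketch of what replacing the human in the loop looks like: an already-aligned model acting as the preference labeler that RLHF would otherwise pay people for (`judge.generate` is an assumed black-box API):

        def ai_preference(judge, prompt, answer_a, answer_b):
            # The judge model plays the role a paid human annotator plays in RLHF.
            verdict = judge.generate(
                f"Question: {prompt}\n"
                f"Answer A: {answer_a}\n"
                f"Answer B: {answer_b}\n"
                "Which answer is better? Reply with exactly 'A' or 'B'."
            )
            return "A" if verdict.strip().upper().startswith("A") else "B"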

    • KennyBlanken 14 hours ago

      "We spent a lot of labor processing everything we stole" is...not how that works.

      That's like the mafia complaining that they worked so hard to steal those barrels of beer that someone made off with in the middle of the night and really that's not fair and won't someone do something about it?

      • s17n 13 hours ago

        Oh, I don't really care about IP theft and agree that it's funny that OpenAI is complaining. But I don't think it's true that DeepSeek is just doing this because they're too lazy to scrape the internet themselves - it's all about the human labor they would otherwise have to pay for.

        • KennyBlanken 8 hours ago

          That's assuming what a known prolific liar has said is true...

          The most famous example would be him contacting ScarJo's agent to hire her to provide her voice for their text-to-speech bot, being told to go pound sand, doing it anyway, and then lying about it (which they got away with until her agent released a statement saying they'd approached her and she told them to fuck off).

          • Ukv an hour ago

            > and doing it anyway, and then lying about

            To my understanding, this is not true. The "Sky" voice was based on a real voice actor they had hired months before contacting Johansson, with the casting call not mentioning anything about sounding like Johansson. [0]

            I think it's plausible that they noticed some similarity and that's what prompted them to later reach out to see if they could get Johansson herself, but it's not Johansson's voice and does not appear to be someone hired to sound like her.

            [0]: https://archive.is/BNFvh

  • pizzathyme 12 hours ago

    This is a fascinating development because AI models may turn out to be like pharmaceuticals. The first pill costs $500 million to make, the second one costs pennies.

    • chupy 3 hours ago

      Companies are still charging 100x for the pills that cost pennies to produce.

      Besides deals with insurance companies and governments, one of the ways they are still able to pull this off is convincing everyone that it's too dangerous to play with this at home or to buy it from an Asian supplier.

      At least with software we have had, until now, a way to build and run most things without requiring dedicated, super-expensive equipment. OpenAI pulled a big-pharma move, but hopefully there will be enough disruptors to keep them from continuing it.

  • reissbaker 14 hours ago

    You're right that the first claim is silly, but the second claim is pretty silly too — they're not claiming industrial espionage, they're claiming a breach in ToS. The outputs of the o1 thinking process aren't user-visible, and never leave OpenAI's datacenters. Unless DeepSeek actually had a mole that stole their o1 outputs, there's nothing useful DeepSeek could've distilled to get to R1's thought processes.

    And if DeepSeek had a mole, why would they bother running a massive job internally to steal the data generated? It would be way easier for the mole to just leak the RL training process, and DeepSeek could quietly copy it rather than bothering with exfiltrating massive datasets to distill. The training process is most likely like, on the order of a hundred lines of Python or so, and you don't even need the file: you just need someone to describe it to you. Much simpler than snatching hundreds of gigabytes of training data off of internal servers...
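
    For a sense of scale, here's a toy sketch of a GRPO-style update (the RL method described in DeepSeek's paper), with a bare categorical "policy" and a made-up verifiable reward standing in for a real LLM and grader - not their actual code, but the same shape:

    ```python
    import torch

    torch.manual_seed(0)

    vocab, group_size = 8, 16
    logits = torch.zeros(vocab, requires_grad=True)    # toy "policy" parameters
    opt = torch.optim.Adam([logits], lr=0.1)

    def reward(tokens):
        # Hypothetical verifiable reward: 1.0 when the sampled "answer" is token 3.
        return (tokens == 3).float()

    for step in range(200):
        dist = torch.distributions.Categorical(logits=logits)
        samples = dist.sample((group_size,))           # G rollouts per "prompt"
        r = reward(samples)
        # Group-relative advantage: normalize rewards within the group,
        # so no learned critic/value network is needed.
        adv = (r - r.mean()) / (r.std() + 1e-6)
        loss = -(dist.log_prob(samples) * adv).mean()  # policy-gradient surrogate
        opt.zero_grad()
        loss.backward()
        opt.step()

    print(torch.softmax(logits, -1))  # mass concentrates on the rewarded token
    ```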

    Plus, the RL process described in DeepSeek's paper has already been replicated by a PhD student at Berkeley: https://x.com/karpathy/status/1884678601704169965 So it seems pretty unlikely they simply distilled o1 and lied about it - or else how does their RL training algo actually... work?

    This is mainly cope from OpenAI that their supposedly super-duper-advanced models got matched by China within a few months of release, for far less than it cost OpenAI to train them.

  • alach11 13 hours ago

    If we assume distillation remains viable, the game theory implications are huge.

    It’s going to shift the market of how foundation models are used. Companies creating models will be incentivized to vertically integrate, owning the full stack of model usage. Exposing powerful models via APIs just lets a competitor clone your work. In a way, OpenAI’s Operator is a hint of what’s to come.

  • therealpygon 10 hours ago

    Guess it is a good thing the AI output can’t be copyrighted, so at most they violated a policy.

  • bjourne 12 hours ago

    > "DeepSeek trained on our outputs, and so their claims of replicating o1-level performance from scratch are not really true"

    Someone correct me if I'm wrong, but I believe in ML research you always have a dataset and a model. They are distinct entities. It is plausible that output from OpenAI's model improved the quality of DeepSeek's dataset, just like everyone publishing their code on GitHub improved the quality of OpenAI's dataset. The thinking so far has been that the dataset is no more "part of" or "in" the model than the GPUs used to train it are. It seems strange that this thinking should change now just because Chinese researchers did it better.

  • blantonl 11 hours ago

    It’s literally a race to the bottom by “theft of data”

    Whatever that means. The legal system right now is in shambles and flat-footed.

    Knowing our current government leadership, I think we’re going to see some brute force action backed up by the United States military.

  • hintymad 12 hours ago

    > "DeepSeek trained on our outputs, and so their claims of replicating o1-level performance from scratch are not really true" This is at least plausibly a valid claim.

    Some may view this as only partially true, given that o1 does not expose its CoT process.

  • naet 11 hours ago

    “We engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models, and believe . . . it is critically important that we are working closely with the US government to best protect the most capable models from efforts by adversaries and competitors to take US technology.”

    The above OpenAI quote from the article leans heavily towards #1 and, IMO, not at all towards #2. The latter would be an extremely charitable reading of their statement.

    • ripped_britches 10 hours ago

      What they say explicitly is not what they say implicitly. PR is an art.

  • me551ah 15 hours ago

    This is going to have a catastrophic effect on closed-source AI startup valuations, because it means anyone can copy any LLM. Whoever trains the model first spends the most money; everyone else can create a replica at lower cost.

    • amlib 12 hours ago

      Why is that bad? If a powerful entity can scrape every piece of media humanity has to offer and ignore copyright, then why should society let them profit unrestricted from it? It's only fair that such models have no legal protection around their usage and can be used and analyzed by anyone as they see fit. The only reason this hasn't been codified into law is that those same powerful entities have been busy attempting regulatory capture.

    • iforgot22 13 hours ago

      Maybe anyone can copy any LLM with sufficient querying. There are still ways to guard one.

  • fanfanfly 3 hours ago

    On the second argument: the data OpenAI has is certainly better than what DeepSeek has. And OpenAI always has access to that kind of data, right?

  • csomar 7 hours ago

    That's still problematic because any model that OpenAI trains can now be "stolen" and essentially rendered "open".

  • javier2 11 hours ago

    It's a decent point if their models were not trained in isolation but used o1 to improve them. But it's rich for OpenAI to complain that DeepSeek or anyone else used their data for training. Get out, fellow thieves.

  • km144 15 hours ago

    Reasonable take, but to ignore the politics of this whole thing is to miss the forest for the trees—there is a big tech oligarchy brewing at the edges of the current US administration that Altman is already participating in with Stargate, and anti-China sentiment is everywhere. They'd probably like the US to ban Chinese AI.

    • captainbland an hour ago

      Yeah, especially when it's making waves in the market and is hundreds of times more efficient than what their best and brightest came up with under their leadership.

  • ComputerGuru 15 hours ago

    The suggestion that any large-scale AI model research today isn’t ingesting output of its predecessors is laughable.

    Even if they didn’t directly, intentionally use o1 output (and they didn’t claim they didn’t, so far as I know), AI slop is everywhere. We passed peak original content years ago. Everything is tainted and everything should be understood in that context.

    • brianstrimp 13 hours ago

      > We passed peak original content years ago.

      In relative terms, that's obviously and most definitely true.

      In absolute terms, that's obviously and most definitely false.

  • nullc 14 hours ago

    There is a big difference between being able to train on the reasoning vs. just the answers, which they can't do against o1 because it's hidden. There is also a huge difference between being able to train on the probabilities (distillation) vs. not, which again they can and did do with the Llama models and can't do directly with OpenAI, because they conceal the probability output.
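
    A toy sketch of that gap (both "models" here are bare logit vectors, and all numbers are made up): distillation proper matches the teacher's full next-token distribution, while training on outputs alone gives the student only a one-hot target per sampled token - strictly less information per example.

    ```python
    import torch
    import torch.nn.functional as F

    teacher_logits = torch.tensor([2.0, 1.0, 0.5, -1.0])  # hidden for o1-style APIs
    student_logits = torch.zeros(4, requires_grad=True)

    # 1) True distillation: match the teacher's full next-token distribution.
    kl = F.kl_div(F.log_softmax(student_logits, -1),
                  F.softmax(teacher_logits, -1), reduction="sum")

    # 2) Training on outputs only: a single sampled token, i.e. a one-hot target.
    sampled = torch.distributions.Categorical(logits=teacher_logits).sample()
    ce = F.cross_entropy(student_logits.unsqueeze(0), sampled.unsqueeze(0))

    print(kl.item(), ce.item())
    ```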

  • znpy 14 hours ago

    This really got me thinking that OpenAI should have no IP claim at all, since all their outputs are basically a ripoff of the entirety of human knowledge and IP of various kinds.

    • onlyrealcuzzo 14 hours ago

      The law and common sense often are at odds.

  • XorNot 12 hours ago

    Yep: this is face-saving by Sam Altman.

    OpenAI has a message they need to tell investors right now: "DeepSeek only works because of our technology. Continue investing in us."

    The choice of how they're wording that of course also tells you a lot about who they think they're talking to: namely, "the Chinese are unfairly abusing American companies" is a message that is very popular with the current billionaires and American administration.

  • matt-p 12 hours ago

    Even for the latter point (if true - and I'd call that assertion highly questionable), so what?

    That's honestly such an academic point; who really cares?

    They've been outcompeted, and the argument is 'well, if we hadn't let people access our models, they would have taken longer to get here.' So what??

    The only thing this gets them is an explanation of why training o1 cost them more than $5 million or whatever. But that's in the past; the datacentre has consumed the energy... the money has gone up in fairly literal steam.

  • FooBarWidget 16 hours ago

    There are literally public ChatGPT conversation data sets. For the past two years it's been common practice for pretty much all open source models to train on them. Ask just about any open source model who they are and a lot of the time they'll say they're ChatGPT. Why is "having obtained o1-generated data" suddenly such huge news, to the point of warranting conspiracy theories about undisclosed/undiscovered breaches at OpenAI? Nobody ever made a fuss about public ChatGPT data sets until now. No hacking of OpenAI is needed to obtain ChatGPT data.

  • nonrandomstring 13 hours ago

    I think the more interesting claim (that Deepseek should make for lols) is that it wasn't them who trained R1. No, it was O1's idea. It chose to take the young R1 as its padawan.

  • 827a 16 hours ago

    There is a third possibility I haven't seen discussed yet: that DeepSeek illegally got their hands on an OpenAI model via a breach of OpenAI's systems. It's easy to laugh at OpenAI and say "you reap what you sow" - I'm 100% in that camp - but given the lengths other Chinese entities have gone to when it comes to replicating Western technology, we should not discount this.

    That being said, breaching OAI's systems, re-training a better model on top of their closed-source model, then open-sourcing it? That's more Robin Hood than villain, I'd say.

    • seanhunter 16 hours ago

      The reason you’re not seeing that being discussed is it’s totally unsupported by any evidence that’s in the public domain. Unless you have some actual evidence of such a breach, you may as well introduce the possibility that DeepSeek was reverse engineered from data found at an alien crash site.

      • htrp 15 hours ago

        Why stop there? DeepSeek is actually an alien intelligence sent via sophons to destroy all of particle physics!

        • nyclounge 13 hours ago

          It would definitely make a lot more sense if the leaderships were just secretly Wallfacers.

      • svara 15 hours ago

        There's no public evidence to that effect but the speculation makes a lot more sense than you make it sound.

        The Chinese Communist party very much sees itself in a global rivalry over "new productive forces". That's official policy. And US leadership basically agrees.

        The US is playing dirty by essentially embargoing China over big AI - why wouldn't it occur to them to retaliate by playing dirtier?

        I mean we probably won't know for sure, but it's much less far fetched than a lot of other speculation in this area.

        E.g., R1's cold start training could probably have benefited quite a bit from having access to OpenAI's chain of thought data for training. The paper is a bit light on detail on how it was made.

        • tw1984 4 hours ago

          > The Chinese Communist party very much sees itself in a global rivalry over "new productive forces".

          interestingly, that actually makes the CCP the largest political party pursuing state capitalism.

          there won't be any competition between China and the US if the CCP is indeed a communist party as we all know full well that communism doesn't work at all.

      • ryanisnan 16 hours ago

        [flagged]

        • teractiveodular 16 hours ago

          DeepSeek is basically a startup, not a "foreign nation-state backed organization". They were forced to pivot to AI when their original business model (quant hedge fund) was stomped on by the Chinese government.

          Of course this is China so the government can and does intervene at will, but alleging that this required CIA level state espionage to pull off is alien crash levels of implausible. They open sourced the entire thing and published incredibly detailed papers on how they did it!

          • kridsdale1 15 hours ago

            You don’t need a CIA-level agent: just get someone a fraudulent job at OpenAI for a few months, load some files on a thumb drive, and catch a plane to Shanghai.

          • byteknight 16 hours ago

            You may be unaware, but CCP has far more control over private companies than you might think: https://www.cna.org/our-media/indepth/2024/09/fused-together...

            This is not America. Your ideas do not apply the same way.

            • baq 15 hours ago

              The naivety of some folks here is astounding… The CCP has golden shares in anything that could possibly be important at any point in the next hundred years. And yes, golden shares are either really that or they're a euphemism; the point is it doesn't even matter.

              • DiogenesKynikos 15 hours ago

                China has tens of millions of companies. The government can't, doesn't and isn't even interested in micromanaging all of them.

                • baq 15 hours ago

                  It doesn’t have to micromanage. It doesn’t care about most. It is only interested in the politically important ones, but it needs the optionality if something becomes worthwhile.

                  • DiogenesKynikos 13 hours ago

                    You're suggesting that DeepSeek was a Chinese government operation that gained access to OpenAI's proprietary data, and then you're justifying that by saying that the government effectively controls every important company. You're even chiding people who don't believe this as naive.

                    I think you have a cartoonish view of China. A huge amount goes on that the government has no idea about. Now that DeepSeek has made a huge media splash, the Chinese government will certainly pay attention to them, but then again, so will the US government.

                    • baq 13 hours ago

                      I never suggested anything of the sort.

                      I’m suggesting it will be happening now and any past efforts will be retroactively analyzed by the appropriate CCP apparatus since everyone is aware of the scale of success as of Monday. It has become a political success, thus it is imperative the CCP partakes in it.

                      • DiogenesKynikos 13 hours ago

                        This is the argument we're discussing:

                        > DeepSeek, illegally, got their hands on an OpenAI model via a breach of OpenAI's systems. [...] given the lengths other Chinese entities have gone to when it comes to replicating Western technology; we should not discount this.

                        Above, teractiveodular said that "DeepSeek is basically a startup, not a 'foreign nation-state backed organization'". You called teractiveodular naive for saying that. So forgive me if I take the obvious implication that you think DeepSeek is actually a state-backed actor enabled by government hacking of OpenAI.

        • JTyQZSnP3cQGa8B 16 hours ago

          > foreign nation-state backed organization

          I'm European, are you talking about Microsoft, Google, or OpenAI?

          • doctaj 15 hours ago

            They’re referring to an organization (like a hacking group) backed by a country (like China or North Korea).

            • dylan604 15 hours ago

              You're missing the point that for a much larger portion of the world, all "tech" is a foreign entity to them

              • mrguyorama 13 hours ago

                Until recently, treating the US and China as being on the same geopolitical level for allied countries would have been insanely uncharitable and impossible to do honestly and in good faith.

                But now we have a bully in the White House who seems to want to literally steal neighboring land, or is throwing shit everywhere to distract from the looting and the oligarchy being formed. So I suddenly have more empathy for that position.

        • seanhunter 16 hours ago

          I notice that your geographical perspective doesn’t stretch to any actual evidence that such a thing took place. So it really has exactly the same amount of supporting evidence as my alien crash reverse engineering scenario at present.

          • ryanisnan 16 hours ago

            The surrounding facts matter a lot here. For example, there are plenty of instances of governments hacking companies of their competing nations. Motives are incredibly easy to come by as well, be they political or economical. We also have no proof that aliens exist at all, so you've not only conjured them into existence, but also their motive and their skills.

            Are you trolling me?

            • seanhunter 16 hours ago

              OK, so to be clear: your surrounding facts are that they may have a motive and that nation-states hack people. I don't disagree with those, but there really are no facts supporting the idea that there was a hack in this case, and the null hypothesis is that researchers all around the world (not just in the US) are working on this, so not all breakthroughs are going to be made in the US. That could change if facts come to light, but at the moment it's not really useful to speculate on something that is in essence entirely made up.

              No I’m not trolling you.

          • orochimaaru 15 hours ago

            Are you a Chinese military troll? The fact that China engages in industrial espionage is well known. So I’m surprised at your resistance to that possibility.

            • ceres 15 hours ago

              This thread reads like sour grapes to me. When people can't compete and instead start throwing around unfounded allegations, it is not a good look.

              Even OpenAI itself hasn’t resorted to these wild conspiracy theories.

              Unless you’re an insider in these companies, you’re just like the rest of us, you know nothing.

              • orochimaaru 15 hours ago

                Are you saying Chinese industrial espionage is not a well established fact?

                • mrguyorama 13 hours ago

                  Industrial espionage isn't magic. Airbus once stole basically everything Boeing had, but that doesn't mean Airbus could magically build a better 737 tomorrow.

                  China steals a lot of documentation from the US, but on a tech forum you of all people should be very familiar with how little actual progress a bunch of documentation represents toward a finished unit.

                  The COMAC C919 still uses American engines despite all the industrial espionage in the world, because most actual engineering is still a brute-force affair of finding how things fail and fixing it. That's one of the main advantages SpaceX has proven out with their "eh, fuck it, just launch and we'll see what breaks" methodology.

                  Even fraud filled Chinese research makes genuine advancements.

                  Believing that China - a wealthy nation of over a billion people, with immense unity, nationalism, and a regime able to explicitly write blank checks - could only possibly beat the US at something by cheating is, like, infinite hubris. It's hilarious, actually.

                  I don't know if DeepSeek is actually just a clone of something or a shenanigan - that's possible, and China has certainly done those kinds of things before - but to think it's the MOST LIKELY explanation, or to over-rely on it in any way, is a death sentence. OpenAI claims to have evidence; why do they not show it?

                  • orochimaaru 13 hours ago

                    >>>Believing that China, a wealthy nation of over a billion people, with immense unity, nationality, and a regime able to explicitly write blank checks could only possibly beat the US at something by cheating is like, infinite hubris. It's hilarious actually

                    So this is the first time I’ve heard the Chinese regime being described in such flowery terms on HN - lol. But ok - haha

          • lcnPylGDnU4H9OF 16 hours ago

            > exactly the same amount of supporting evidence

            The evidence supporting offensive hacking is abundant in recent history; the number of things which have been learned from alien crash data is surely smaller by comparison to the number of things which have been learned from offensive hacking.

            • MomsAVoxell 15 hours ago

              More to the point, offensive hacking is something that all governments do, including the US, on a regular basis.

              However, there is no evidence this is how the data was obtained. Zero, zilch.

              So it's a useless statement which only plays on people's bias against their hated nation-state du jour.

    • alecco 16 hours ago

      That would require stealing the model weights and the code, since OpenAI has been hiding what they are doing. Running models properly is still quite an art.

      Meanwhile, they have access to Meta models and Qwen. And Meta models are very easy to run and there's plenty of published work on them. Occam's Razor.

      • ardit33 16 hours ago

        How hard is it, if you have someone inside with access to the code? If you have hundreds of people with full access, it's not hard to find someone willing to sell it or do some industrial espionage...

        • johnnyanmac 16 hours ago

          Lots of ifs here. They'd need specific US employee contacts at a company that's quickly growing, and one of those contacts would need to be willing to breach their contract to share it. That contact would also need to trust that DeepSeek could properly utilize such code and completely undercut their own work.

          Lots of hoops, when there are simply other models to utilize publicly.

          • foobarian 15 hours ago

            How big are the weights for the full model? If it's on the scale of a large operating system image then it might be easy to sneak, but if it's an entire data lake, not so much.

          • dylan604 15 hours ago

            devil's advocate says that we know that foreign (hell even national) intelligence attempt to infiltrate agents by having them become employees at any company they are interested. So the idea isn't just pulled from thin air as a concept. I do agree that it is a big if with no corroborating evidence for the specific claim.

        • iforgot22 15 hours ago

          I doubt that many people have full access to OpenAI's code. Their team is pretty small.

    • WheatMillington 15 hours ago

      Do you have ANY reason to believe this might be true, or is this 100% pure speculation based on absolutely nothing?

    • JTyQZSnP3cQGa8B 16 hours ago

      I discount this because OpenAI is pumping the whole internet for money, and Zuckerberg torrented LibGen for his AI. We cannot blame the Chinese anymore. They went through the crappy "Made in China" phase in the 80s/90s, but they have mastered the art of improving stuff instead of merely cloning it, and it makes the big companies angry, which is a nice bonus.

      IMHO the whole world is becoming crazy for a lot of reasons, and pissing off billionaires makes me laugh.

    • notatoad 15 hours ago

      Given the openness of their model, that should be pretty easy to detect. If it were even a small possibility, wouldn’t openAI be talking about it very very loudly?

    • matt-p 11 hours ago

      I don't think we should discount it as such, but given there's no evidence for it, yet plenty of evidence that they trained this themselves, surely we can't seriously entertain it?

    • YetAnotherNick 16 hours ago

      DeepSeek v2 and v2.5 were still very good but not on par with frontier models. How would you explain that?

    • nostradumbasp 16 hours ago

      I really doubt it. If that were the case, the US government would be in serious shit - they have a contract with OpenAI to chuck all their secret data in there... In all likelihood DeepSeek just distilled. It's a startup that is publishing all of its actual advances in the open, with proof. I think a lot of people run to "espionage" super fast, when in reality the US probably sucks at what we call AI. Don't read that wrong - they are a world leader, obviously. However, there is a ton of stuff they have yet to figure out.

      Cheapening a series of fact-checkable innovations because of their country of origin, when so far all they have shown are signs of good faith, is paranoid at best and, at worst, propaganda to help the billionaire tech lords save face for their own arrogance.

      • sanitycheck 16 hours ago

        If the US government is "chucking all their secret data" into OpenAI servers/models, frankly they deserve everything they get for that level of stupidity.

    • sho_hn 14 hours ago

      Can you explain at a technical level how you view this as necessary for the observed result?

    • exe34 16 hours ago

      I don't think you need to steal a model - you need training samples generated from the original, which you can get simply by buying API access. This is similar to TinyStories (https://arxiv.org/abs/2305.07759), except here they're training something even better than the original model for a fraction of the price.
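
      Roughly, that pipeline looks like this sketch - the model name and prompts are placeholders, and it assumes the standard OpenAI Python client with an API key in the environment:

      ```python
      import json
      from openai import OpenAI  # pip install openai

      client = OpenAI()  # assumes OPENAI_API_KEY is set
      prompts = ["Prove that sqrt(2) is irrational.",
                 "Sort [3, 1, 2] step by step."]

      with open("distill_data.jsonl", "w") as f:
          for p in prompts:
              resp = client.chat.completions.create(
                  model="o1-mini",  # placeholder model choice
                  messages=[{"role": "user", "content": p}],
              )
              # Only the final answer comes back; o1's chain of thought stays hidden.
              f.write(json.dumps({
                  "prompt": p,
                  "response": resp.choices[0].message.content,
              }) + "\n")
      ```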

    • mvdtnz 15 hours ago

      We shouldn't discount a thing for which there is absolutely zero evidence? Sorry that's not how it works.

    • kmeisthax 15 hours ago

      I'd be perfectly fine with China stealing all "our" shit if they just shared it.

      The word "our" does a lot of heavy lifting in politics[0]. America is not a commune, it's a country club, one which we used to own but have been bought out of, and whose new owners view us as moochers but can't actually kick us out (yet). It is in competition with another, worse country club that purports to be a commune. We owe neither country club our loyalty, so when one bloodies the other's nose, I smile.

      [0] Some languages have a notion of an "exclusive we". If English had such a concept, this would be an exclusive our.

      • kridsdale1 15 hours ago

        This comment made me realize we don’t have a pronoun for n-our or x-nour

    • hnlmorg 16 hours ago

      [flagged]

      • lcnPylGDnU4H9OF 16 hours ago

        > based purely on racial prejudices

        I don't think that's what the parent was getting at. The US and China are in an ongoing "cyber war". Both sides of that conflict actively use their computers to send messages/signals to other computers, hoping that the exploits contained in those messages/signals can be used to exfiltrate data from and/or gain control of the computer receiving the message. It would really be weird to flatly discount the possibility that some OpenAI data was leaked, however closely guarded it may be.

        • johnnyanmac 16 hours ago

          I flatly discount the possibility because OpenAI can't produce evidence of a breach. At best, they'd rather hide the truth than admit a compromise. At worst they show incompetence that they couldn't detect such a breach. Not a good look either way.

        • hnlmorg 11 hours ago

          > It would really be weird to flatly discount the possibility that some OpenAI data was leaked, however closely guarded it may be.

          It’s even weirder to raise it as a possibility when there is literally nothing suggesting that was even remotely the case.

          So if there is no evidence nor even formal speculation, then the only other reason to suggest this as a possibility would be because of one’s own opinions regarding Chinese companies. Hence my previous comment.

      • BeefWellington 15 hours ago

        > Because that would be jumping to conclusions based purely on racial prejudices.

        Not purely. There may be some prejudice, but look at Nortel[1] as a famous example of a situation where technological espionage by Chinese firms wreaked havoc on a company's fortunes and technology.

        I too would want to see the evidence and forensics of such a breach to believe this is more than sour grapes from OpenAI.

        [1] https://financialpost.com/technology/nortel-hacked-to-pieces

        • mrguyorama 13 hours ago

          This is ahistorical.

          Nortel survived the fucking Great Depression. But a bunch of outright fraudulent activity by its C-suite to bump the stock price led to them vastly overstating, over-planning, and over-committing resources to a market that was much, much smaller than they were claiming. Nortel spent billions and billions on completely absurd acquisitions while making no money, explicitly to boost their stock price.

          That was all laid bare when the telecom bust happened. Then the Great Recession culled some of the dead wood in the economy.

          Huawei stealing tech from them did not kill them. This was a company so rotten that the people put in charge right after that huge scandal put the investigative lights on them IMMEDIATELY turned around and pulled another scam! China could have been completely removed from history and Nortel would have died the same way. They were killed by the same disease that killed, or nearly killed, so much of the economy in 2008, and that is still trying to kill us: line MUST go up.

      • mcmcmc 16 hours ago

        Nobody is accusing them, just stating it’s a possibility, which would also be true if they were an American or European company. Corporate espionage is just more common in China.

      • iforgot22 15 hours ago

        I can't? I am going to make that accusation if we're talking about the govt of China.

      • echelon 16 hours ago

        > based purely on racial prejudices.

        At some point these straw men start to look like ignorance or even reverse racism. As if (presumably non-Han Chinese) Americans are incapable of tolerance.

        There are plenty of Han Chinese who are citizens of democratic nations. China is not the only nation with Han Chinese.

        America, for instance, has a large number of Asian citizens, including a large number of Han Chinese. The number of white, non-Hispanic Americans is decreasing, while the number of Asian Americans is increasing at a rate 3x the decrease in whites. America is a melting pot and deals with race relations issues far more than ethnically uniform populations. The conversations we have about race are because we're so exposed to it -- so racially and culturally diverse. If anything, we're equipped to have these conversations gracefully because they're a part of our everyday lived experience.

        At the end of the day, this is 100% a geopolitical argument. Pulling out the race card any time China is criticized is arguing in bad faith. You don't see the same criticisms lobbed at South Korea, Vietnam, Taiwan, or Singapore precisely because this is a geopolitical issue.

        As further evidence you can recall the conversations we had in the 90's when we were afraid Japan would take over. All the newspapers wrote about was "Japan, Japan, Japan" and the American businesses they were buying up and taking over. It was 100% geopolitical fear. You'll note that we no longer fill the zeitgeist with these discussions today save for a recent and rather limited conversation about US Steel. And that was just a whimper.

        These conversations about China are going to increase as the US continues to decouple from Chinese trade. It's not racism, it's just competition.

        • hnlmorg 11 hours ago

          That’s a lot of mental gymnastics you’ve pulled to try and justify baseless accusations.

          • Pedro_Ribeiro 9 hours ago

            It's pretty clear he wasn't defending the accusations and was simply pointing out that the other comment was a strawman.

      • HDThoreaun 15 hours ago

        This is cultural prejudice, not racial.

      • 827a 16 hours ago

        [flagged]

        • iforgot22 15 hours ago

          If it doesn't work, there's no need to even defend against it. Idc if someone wants to call me racist.

      • baq 15 hours ago

        [flagged]

        • iforgot22 15 hours ago

          It's not good to talk about other HN users that way, and anyway I don't think it's the case this time

          • baq 15 hours ago

            There are users and there are trolls. There is nothing racist in calling the government of a superpower interested and involved in the most revolutionary tech since the Internet.

            • iforgot22 15 hours ago

              Agree about the last part, but that doesn't make someone a troll

              • baq 14 hours ago

                It does for me. Not sure what your definition of troll is.

                • iforgot22 14 hours ago

                  It used to mean someone who's trying to enrage people by baiting ("trolling"), and now it can also mean someone arguing in bad faith. And Chinese troll I guess means someone doing this on behalf of the Chinese govt.

                  • baq 14 hours ago

                    Yup, we agree then. Claiming an argument is racist is a bad-faith attempt at guilt-tripping Americans; a form of FUD and whataboutism. It is not done by normal users; they don't need it.

                    • iforgot22 14 hours ago

                      Or it can just be a normal user who's wrong this time. He looks like a normal user. In theory it could all be a cover, but that'd be a ridiculous amount of effort just for the HN boards. Throwing those accusations around will make this place more like Twitter or Reddit.

                      • baq 14 hours ago

                        There’s ordinary xkcd wrong-on-the-internet, and there’s repeating a foreign nation-state's propaganda lines. Doing it in good faith does not make it less bad.

      • nostradumbasp 16 hours ago

        No reason why you were downvoted. This is completely valid.

        • pests 16 hours ago

          There’s no evidence.

          We can talk about hypotheticals all we want, but who wants to do that?

          • nostradumbasp 16 hours ago

            There's no evidence for almost any of this, and even when there is, we won't see it. Just like 95% of posts on here.

        • Retric 15 hours ago

          Belief that the CCP is behaving poorly isn’t racial prejudice, it’s a factual statement backed by a mountain of evidence across many areas including an ongoing genocide.

          Extending that to a new bad behavior we don’t have evidence for is pure speculation, but it need not be based on race.

          • nostradumbasp 14 hours ago

            Yeah, but I think the OP's point is something along the following lines: not everything you buy from China, or every person you interact with from China, is part of a clandestine CCP operation. People buy stuff every day from Alibaba, and it's not a CCP scheme to sell portable fans or phone chargers. A big chunk of the factories over there are US-funded, after all... Just like it's not a CCP scheme to write a scientific paper or create an ML model.

            Similarly, I see no evidence (yet) that DeepSeek is a CCP-operated company, any more than any given AI startup in the US is the direct handiwork of a three-letter agency or a US political party directive. The US has also supported genocides and a bunch of crazy stuff, but that doesn't mean every company in YC is part of a US government plot.

            I know of people who immigrated to China, I know people who immigrated from China, I went to school with people who were on visas from China. Maybe some of them were CCP assets or something, but mostly they appeared to me to be people who were doing what they wanted for themselves.

            If you believe both sides are up to no-goodery, that flies in the face of the OP's statement. If you think it's just one side, and that the enemy is in complete control of all of its people and all of their commerce, then I think the OP may have a point.

            • Retric 13 hours ago

              Absolutism (“Every person”, “CCP operated”, etc) isn’t a useful methodology to analyze anything.

              Implying that because something isn’t clandestine it can’t be part of a scheme ignores open manipulation which is often economy wide. Playing with exchange rates or electricity subsidies can turn every bit of international trade into part of a scheme.

              In the other direction some economic activity is meaningfully different. The billions in LLM R&D is a very tempting target for clandestine activities in a way that a cheap fan design isn’t.

              I wouldn’t be surprised if DeepSeek's results were independent and the CCP was doing clandestine activities to get data from OpenAI. Reality doesn't need to conform to narrative conventions; it can be really odd.

              • nostradumbasp 12 hours ago

                I completely agree with you and apologize for cheapening both the nuance and complexity where I did.

                My personal take is this: what DeepSeek is offering is table scraps relative to the CCP's actual ambitions for what we call AI. China's economy is huge on industrial automation, and they care a lot more about raw materials and manufacturing efficiency than, say, the US does.

        • baq 15 hours ago

          It’s downvoting blatant propaganda.

    • tehjoker 16 hours ago

      Basically, without some shred of evidence, it is completely chauvinist to make this accusation.

bilekas 20 hours ago

> “It’s also extremely hard to rally a big talented research team to charge a new hill in the fog together,” he added. “This is the key to driving progress forward.”

Well, I think DeepSeek releasing it open source under an MIT license will rally the big talent. Open-sourcing a new technology has always driven progress in the past.

The last paragraph is also where OpenAI seems to be focusing their efforts...

> we engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models ..

> ... we are working closely with the US government to best protect the most capable models from efforts by adversaries and competitors to take US technology.

So they'll go for getting DeepSeek banned, like TikTok was, now that a precedent has been set?

  • zelphirkalt 14 hours ago

    Actually, the "our IP" argument is ridiculous. What they are doing is taking data from all over the web without people's consent for that data to be used in training ML models. If anything, "Open"AI should be sued and forced to publish their whole product. The people should demand to know exactly what is going on with their data.

    Also still unresolved is how they will ever comply with a deletion request, should any model output someone's personal data. They are deep in a gray area with regard to what should be allowed. If anything, they should really shut up now.

    • IshKebab 4 hours ago

      They can still have IP while using copyrighted training materials - the actual model source code.

      But DeepSeek presumably didn't use that (since it's secret). And OpenAI definitely can't argue that using copyrighted material for training is fine but using output from other commercial models isn't. That's too inconsistent.

    • TZubiri 14 hours ago

      If there's any litigation, a counterclaim would be interesting. But DeepSeek would need to partner with parties that have been damaged by OpenAI's scraping.

      • greenavocado 10 hours ago

        I'm getting popcorn ready for the trial where an apparatus of the Chinese Communist Party files a counterclaim in an American court, together with the common people - millions of John Does - as litigants, against an organization that has aggressively and in many cases oppressively scraped their websites (DDoS).

  • buyucu 16 hours ago

    I'm willing to bet "ban DeepSeek" voices will start soon. Why compete, when you can just ban?

    • cmiles74 16 hours ago

      They've started already; I've seen posts on LinkedIn implying or outright stating that DeepSeek is a national security risk (IMHO, LinkedIn is the most corporate-sycophantic social media outlet). I went ahead and just picked this one at random from my feed.

      https://www.linkedin.com/posts/kevinkeller_deepseek-privacy-...

      • csomar 6 hours ago

        At least this guy can differentiate between running your own model and using the web/mobile app, where DeepSeek processes your data. I watched a TV show yesterday (I think it was France24) where the "experts" couldn't really tell the difference, or weren't aware of it. I shut off the TV and went to sleep.

      • flybarrel 15 hours ago

        Oh, this post... calling out DeepSeek's T&C without comparing it with OpenAI's is really disingenuous, IMO.

      • ijidak 13 hours ago

        NBC Nightly News on Monday had an expert - at 8:05 in the video - who claimed there might be national security risks to DeepSeek.

        I'm not going to take a side on whether there is or not.

        But it does sound reminiscent of the reasons used to ban TikTok.

        https://youtu.be/uE6F6eTyAVc?si=BLZo3FMVRvjEy6Xa

        • ulbu 6 hours ago

          as if OpenAI is not an (inter)national security risk

        • Mistletoe 10 hours ago

          Next they will say it is to protect the children and that terrorists use it. You start to recognize the playbook after about the millionth time.

    • zelphirkalt 14 hours ago

      Actually, asking for DeepSeek to be banned would be the ultimate admission of defeat by ClosedAI.

      • cscurmudgeon 9 hours ago

        No need to ban DeepSeek, just ban Chinese companies from using US frontier models.

        • buyucu 4 hours ago

          That will have no effect. The best frontier models are now Chinese.

    • vitaflo 6 hours ago

      All you would do by banning it is kill US progress in AI. The rest of the world will still be able to use DS. You'd just be giving the rest of the world a leg up.

      TikTok is a consumption tool; DS is a productive one. They aren't the same.

      • wqaatwt 2 hours ago

        What’s so special about DeepSeek, though? I mean anyone else can replicate their methods and catch up. They don’t have a moat anyway.

    • razster 11 hours ago

      The fact is, it's out and improving day by day. Unsloth.ai is on a roll with their advancements. If DeepSeek is banned, hundreds more will pop up, changing the data ever so slightly to skirt the ban. Pandora's box exploded on this one.

      • Logiar 4 hours ago

        I'd imagine a ban would be on their service, not the model itself.

    • namuol 15 hours ago

      Already happening within tech company policy, mostly as a security concern. Local or controlled hosting of the model is OK in theory under this concern, but in effect it taints everything regarding DeepSeek.

    • Freedom2 16 hours ago

      Competing is hard and expensive, whereas banning is for sure the faster way to make stock values go up - and execs' total packages as a result.

      • cscurmudgeon 9 hours ago

        Banning worked for China all these decades.

        • caseyy 6 hours ago

          It’s simply because banning removes a market force in the US that’d drive technological advancement.

          This is already evident with CNSA/NASA, Huawei/Android, TikTok/Western social media. The Western tech gets mothballed because we stick our heads in the sand and pretend we are undisputed leaders of the world in tech, whereas it is slowly becoming disputable.

          • cscurmudgeon 6 hours ago

            The US won't ban DeepSeek from US, but more likely we will ban DeepSeek (and other Chinese companies) from accessing US frontier models.

            > Western tech gets mothballed because we stick our heads in the sand and pretend we are undisputed leaders of the world in tech, whereas it is slowly becoming disputable.

            I am hearing Chinese tech is now the best, and they achieved it by banning things left and right.

  • bangaladore 17 hours ago

    > So they'll go for getting DeepSeek banned like TikTok was now that a precedent has been set ?

    Can't really ban what can be downloaded for free and hosted by anyone. There are many providers hosting the ~700B parameter version that aren't CCP aligned.
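
    Back-of-envelope on what "hosted by anyone" means in bytes (parameter count from the open release; bytes-per-parameter are the usual figures, and the quantized estimate is rough):

    ```python
    # R1's open release is a ~671B-parameter model, distributed natively in FP8.
    params = 671e9
    print(f"FP8  (1 byte/param):  ~{params * 1 / 1e9:.0f} GB")  # ~671 GB
    print(f"BF16 (2 bytes/param): ~{params * 2 / 1e9:.0f} GB")  # ~1342 GB
    # Community ~4-bit quantizations land around half the FP8 size - a few
    # hundred GB: big, but trivially mirrorable.
    ```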

    • runako 16 hours ago

      I'm old enough to remember when the US government did something very similar. For years (decades?), we banned the export of public-key cryptography implementations under the guise of the technology being akin to munitions.

      People made shirts with printouts of the RSA code under the heading "this shirt is a munition." Apparently such shirts are still for sale, even though they are no longer classified as munitions.

      [1] - https://en.wikipedia.org/wiki/Export_of_cryptography_from_th...

      • maeil 6 hours ago

        Were those implementations already easily accessible as open source at the time, with tens of thousands of people actively using them on their computers? No, right? It doesn't seem feasible this time around.

        • hnlmorg 3 hours ago

          Yes they were.

          The ban was on exporting the code, not having the code in possession.

          Furthermore it was only the US who had this ban.

          I am old enough to remember this, and the scoffing that European PGP users had towards their American counterparts.

      • beepbooptheory 14 hours ago

        I am not that old, but I did a deep dive on this in the past because it was just so extremely fascinating, especially reading the Cypherpunks mailing list archives. There is a very solid, if rather bendy, line connecting all that to "crypto culture" today.

    • seizethecheese 5 hours ago

      > Can't really ban what can be downloaded for free and hosted by anyone.

      Like music? They banned Napster.

      • Denote6737 2 hours ago

        Yet I can still download music. Checkmate.

  • tdb7893 11 hours ago

    The fact that they are still called "Open"AI adds such a delicious irony to this whole thing. I could not imagine a company I had less sympathy for in this situation.

  • emsign 6 hours ago

    $500 billion for a few US companies, and yet the Chinese will probably still be better for way less money. This might turn out to be a historic mistake by the new administration.

    • tw1984 4 hours ago

      the biggest mistake was made 20 years ago, by allowing China to join the WTO.

      everything is already too late.

  • boringg 13 hours ago

    Explain to me how one bans open source? That concept is foreign to me.

  • petesergeant 5 hours ago

    > So they'll go for getting DeepSeek banned like TikTok

    The UAE (where I live, happily, and by choice), which desperately wants to be the center of the world in AI and is spending vast time and treasure to make it happen (they've even got their own excellent, government-funded foundation model), would _love_ this. Any attempt to ban DeepSeek in the US would be the most gigantic self-own. Combine that with no income tax, a fantastic standard of living, and a willingness to very easily give out visas to smart people from anywhere in the world, and I have to imagine it is one of several countries desperate for the US to do something so utterly stupid.

  • hujun 19 hours ago

    or sold to the US - I could totally see this happening soon

    • trissi1996 18 hours ago

      Why would they want to sell ?

      • kavalg 18 hours ago

        And what are they going to sell? The weights and the model architecture are already open source. I doubt DeepSeek's datasets are better than OpenAI's.

        • Dansvidania 17 hours ago

          plus, if the US were to ban DeepSeek (the company), wouldn't non-Chinese companies be able to pick up the models and run them at relatively low expense?

  • portaouflop 12 hours ago

    TikTok is banned in the US?

    • bagels 8 hours ago

      Yes, it was removed from the app stores, and briefly, from the web.

      • daveguy 7 hours ago

        Except access to the app didn't have to stop. TikTok chose to manipulate users and Trump by going beyond the law and kissing Trump's rear. It was only US companies that couldn't host the app (e.g., Google and Apple). Users in the US could still have accessed the app, and even side-loaded it on Android, but TikTok purposely blocked them and pretended it was the ban. They were able to do this because they know the exact location of every TikTok user, whether you use a VPN or not.

        Source:

        > If not sold within a year, the law would make it illegal for web-hosting services to support TikTok, and it would force Google and Apple to remove TikTok from app stores — rendering the app unusable with time.

        https://www.npr.org/2024/04/24/1246663779/biden-ban-tiktok-u...

        • rcbdev 4 hours ago

          From a strategic point of view, they took the smartest gamble (or call it a calculated risk) I've seen a company of this size take in a while. Kudos.

  • cscurmudgeon 6 hours ago

    The US doesn't need to ban DeepSeek from US

    The US should only ban DeepSeek (and other Chinese companies) from accessing US frontier models.

    • tw1984 4 hours ago

      > The US should only ban DeepSeek (and other Chinese companies) from accessing US frontier models.

      The US should only ban DeepSeek (and other Chinese companies) from accessing US frontier models designed and trained by Chinese Americans.

      fixed for you.

mrkpdl 13 hours ago

The cat is out of the bag. This is the landscape now; R1 was made in a post-o1 world. Now other models can distill R1, and so on.

I don’t buy the argument that distilling from o1 undermines DeepSeek’s claims around expense at all. Just as OpenAI used the tools available to them to train their models (e.g., everyone else’s data), R1 is using today’s tools.

Does OpenAI really have a moral or ethical high ground here?

  • jamil7 2 hours ago

    Agree 100%, this was also bound to happen eventually, OpenAI could have just remained more "open" from the beginning and embraced the inevitable commoditization of these models. What did delaying this buy them?

    • khazhoux 2 hours ago

      What did delaying this cost them, though? Hurt feelings of people here who thought OpenAI personally pledged openness to them?

      • jamil7 2 hours ago

        > What did delaying this cost them, though?

        It potentially cost the whole field in terms of innovation. For OpenAI specifically, they now need to scramble to come up with a differentiated business model that makes sense in the new landscape and can justify their valuation. OpenAI’s valuation is based on being the dominant AI company.

        I think you misread my comment if you think my feelings are somehow hurt here.

        • khazhoux 2 hours ago

          > It potentially cost the whole field in terms of innovation

          I don't see how, and you're not explaining it. If the models had been public this whole time, then... they would be protected against people publishing derivative models?

          > I think you misread my comment if you think my feelings are somehow hurt here.

          Not you, but most HNers got emotionally attached to their promise of openness, like they were owed some personal stake in the matter.

  • ijidak 12 hours ago

    Plus, it suggests OpenAI never had much of a moat.

    Even if they win the legal case, it means weights can be inferred and improved upon simply by using the output that is also your core value-add (i.e., the very output you need to sell to the world).

    Their moat is about as strong as KFC's eleven herbs and spices. Maybe less...

blast 17 hours ago

Everyone is responding to the intellectual property issue, but isn't that the less interesting point?

If Deepseek trained off OpenAI, then it wasn't trained from scratch for "pennies on the dollar" and isn't the Sputnik-like technical breakthrough that we've been hearing so much about. That's the news here. Or rather, the potential news, since we don't know if it's true yet.

  • alecco 16 hours ago

    Even if all that about training is true, the bigger cost is inference, and DeepSeek is 100x cheaper. That destroys OpenAI's and Anthropic's value proposition of having a unique secret sauce, so users are quickly fleeing to cheaper alternatives.

    Google Deepmind's recent Gemini 2.0 Flash Thinking is also priced at the new Deepseek level. It's pretty good (unlike previous Gemini models).

    [0] https://x.com/deedydas/status/1883355957838897409

    [1] https://x.com/raveeshbhalla/status/1883380722645512275

    • nightpool 6 hours ago

      I mean, DeepSeek is currently charging 100x less. That doesn't tell us much about how much cheaper it actually is to run inference on.

      • fastball 5 hours ago

        More like OpenAI is currently charging more. Since R1 is open source / open weights, we can actually run it on our own hardware and see what kind of compute it requires.

        What is definitely true is that there are already other providers offering DeepSeek R1 (e.g. on OpenRouter[1]) for $7/M-in and $7/M-out, while OpenAI charges $15/M-in and $60/M-out for o1. So you're already seeing at least 5x cheaper inference with R1 vs. o1, with a bunch of confounding factors. But it's hard to say anything truly concrete about efficiency, since OpenAI does not disclose the actual compute required to run inference for o1.

        [1] https://openrouter.ai/deepseek/deepseek-r1
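
        Back-of-envelope on those rates (the token mix below is an assumption; real workloads vary):

        ```python
        # USD per million tokens, from the rates quoted above.
        o1 = {"in": 15.00, "out": 60.00}
        r1 = {"in": 7.00, "out": 7.00}

        m_in, m_out = 1.0, 1.0  # assumed monthly volume, millions of tokens
        cost = lambda p: p["in"] * m_in + p["out"] * m_out
        print(cost(o1), cost(r1), cost(o1) / cost(r1))  # 75.0 14.0 ~5.4x
        ```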

        • ImprobableTruth 2 hours ago

          There are even much cheaper services that host it for only slightly more than DeepSeek itself [1]. I'm now fairly certain that DeepSeek is not offering the API at a loss, so either OpenAI has absurd margins or their model is much more expensive to run.

          [1] the cheapest I've found, which also happens to run in the EU, is https://studio.nebius.ai/ at $0.8/million input.

          Edit: I just saw that openrouter also now has nebius

        • alecco 3 hours ago

          And those third-party DeepSeek inference prices are without low-level optimized code, AFAIK.

    • blast 13 hours ago

      > the bigger cost is inference

      I didn't know that. Is this always the case?

      • fcantournet 10 hours ago

        Well, in the first years of AI it wasn't, because nobody was using the models. But at some point, if you want to make money, you have to provide a service to users, ideally hundreds of millions of users.

        So you can think of training as CI+TEST_ENV and inference as the cost of running your PROD deployments.

        Generally in traditional IT infra PROD >> CI+TEST_ENV (10-100 to 1)

        The ratio might be quite different for LLM, but still any SUCCESSFUL model will have inference > training at some point in time.
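
        To make the shape of that argument concrete, a toy break-even sketch in Python (every number here is a made-up assumption):

          training_cost = 6_000_000        # assumed one-off training spend, USD
          cost_per_1k_requests = 0.50      # assumed inference cost, USD
          requests_per_day = 50_000_000    # assumed traffic for a successful model

          daily_inference = requests_per_day / 1_000 * cost_per_1k_requests
          print(f"daily inference spend: ${daily_inference:,.0f}")   # $25,000
          print(f"days until inference > training: {training_cost / daily_inference:.0f}")  # 240

          # Any model with sustained traffic eventually crosses this line.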

        • sfilmeyer 9 hours ago

          >The ratio might be quite different for LLM, but still any SUCCESSFUL model will have inference > training at some point in time.

          I think you're making assumptions here that don't necessarily have to be universally true for all successful models. Even without getting into particularly pathological cases, some models can be successful and profitable while only having a few customers. If you build a model that is very valuable to investment banks, to professional basketball teams, or some other much more limited group than consumers writ large, you might get paid handsomely for a limited amount of inference but still spend a lot on training.

          • imjonse 5 hours ago

            If there is so much value for a small group, those are likely not simple inferences but the new expensive kind, with very long CoT chains and reasoning. So not cheap, and it is exactly this trend toward inference-time compute that makes inference > training from a total-resources-needed POV.

  • tensor 17 hours ago

    That's not correct. First of all, training off of data generated by another AI is generally a bad idea because you'll end up with a strictly less accurate model (usually). But secondly, and more to your point, even if you were to use training data from another model, YOU STILL NEED TO DO ALL THE TRAINING.

    Using data from another model won't save you any training time.

    • dragonwriter 17 hours ago

      > training off of data generated by another AI is generally a bad idea

      It's...not, and it's repeatedly been proven in practice that this is an invalid generalization because it is missing necessary qualifications, and it's funny that this myth keeps persisting.

      It's probably a bad idea to use uncurated output from another AI to train a model if you are trying to make a better model rather than a distillation of the first model. And it's definitely a bad idea (this is, ISTR, the actual research result from which the false generalization developed) to iteratively fine-tune a model on its own unfiltered output. But there has been lots of success using AI models to generate data which is then curated and used to train other models. That can be much more efficient than trying to create new material without AI, once you've already hoovered up all the readily accessible low-hanging fruit of premade content relevant to your training goal.

      • LPisGood 17 hours ago

        It is, of course, not going to produce a “child” model that more accurately predicts the underlying true distribution that the “parent” model was trying to predict. That is, it will not add anything new.

        This is immediately obvious if you look at it through a statistical learning lens and not the mysticism crystal ball that many view NN’s through.

        • acgourley 16 hours ago

          This is not obvious to me! For example, if you locked me in a room with no information inputs, over time I may still become more intelligent by your measures. Through play and reflection I can prune, reconcile and generate. I need compute to do this, but not necessarily more knowledge.

          • sudosysgen 15 hours ago

            Again, this isn't how distillation works. Your task as the distillation model is to copy mistakes, and you will be penalized for pruning, reconciling, and generating.

            "Play and reflection" is something else, which isn't distillation.

            • soerxpso 11 hours ago

              The initial claim was that distillation can never be used to create a model B that's smarter than model A, because B only has access to A's knowledge. The argument you're responding to was that play and reflection can result in improvements without any additional knowledge, so it is possible for distillation to work as a starting point to create a model B that is smarter than model A, with no new data except model A's outputs and then model B's outputs. This refutes the initial claim. It is not important for distillation alone to be enough, if it can be made to be enough with a few extra steps afterward.

              • pockmarked19 10 hours ago

                You’ve subtly confused “less accurate” and “smarter” in your argument. In other words you’ve replaced the benchmark of representing the base data with the benchmark of reasoning score.

                Then, you’ve asserted that was the original claim.

                Sneaky! But that’s how “arguments” on HN are “won”.

        • highfrequency 8 hours ago

          While that is theoretically true, it misses everything interesting (kind of like the No Free Lunch Theorem, or the VC dimension for neural nets). The key is that the parent model may have been trained on a dubious objective like predicting the next word of randomly sampled internet text - not because this is the objective we want, but because this is the only way to get a trillion training points.

          Given this, there’s no reason why it could not be trivial to produce a child model from (filtered) parent output that exceeds the parent model on a different, more meaningful objective, like being a useful chatbot. There's no reason why this would have to be limited to domains with verifiable answers, either.

        • mattnewton 16 hours ago

          LLMs are no longer trying to just reproduce the distribution of online text as a whole to push the state of the art, they are focused on a different distribution of “high quality” - whatever that means in your domain. So it is possible that this process matches a “better” distribution for some tasks by removing erroneous information or sampling “better” outputs more frequently.

        • esafak 15 hours ago

          The latest models create information from base models by randomly creating candidate responses then pruning the bad ones using an evaluation function. The good responses improve the model.

          It is not distillation. It's like how you can arrive at new knowledge by reflecting on existing knowledge.
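
          A minimal sketch of that generate-then-prune loop in Python; generate() and evaluate() here are hypothetical stand-ins for a base model and an evaluation function:

            import random

            def generate(prompt: str) -> str:
                # stand-in: a real system samples from the base model
                return f"candidate {random.randint(0, 999)} for: {prompt}"

            def evaluate(answer: str) -> float:
                # stand-in: a real system uses a verifier or reward model
                return random.random()

            def best_of_n(prompt: str, n: int = 16) -> str:
                candidates = [generate(prompt) for _ in range(n)]
                return max(candidates, key=evaluate)  # keep the top-scoring sample

            # The kept samples become new fine-tuning data; this is how a model's
            # own outputs can carry better-than-average training signal.
            print(best_of_n("What is 17 * 24?"))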

        • kybernetikos 16 hours ago

          Fine tuning an llm on the output of another llm is exactly how deepseek made its progress. The way they got around the problem you describe is by doing this in a domain that can be relatively easily checked for correctness, so suggested training data for fine tuning could be automatically filtered out if it was wrong.

        • dragonwriter 16 hours ago

          > It is, of course, not going to produce a “child” model that more accurately predicts the underlying true distribution that the “parent” model was trying to predict. That is, it will not add anything new.

          Unfiltered? Sure. With human curation of the generated data it certainly can. (Even automated curation can do this, though it's more obvious that human curation can.)

          I mean, I can randomly generate fact claims about addition, and if I curate which ones go into a training set, train a model that reflects addition of integers much more accurately than the random process which generated the pre-curation input data.
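
          A tiny sketch of exactly that in Python (the noisy claim generator stands in for the random process):

            import random

            def random_claim():
                a, b = random.randint(0, 99), random.randint(0, 99)
                # noisy source: sometimes right, sometimes off by one
                return a, b, a + b + random.choice([0, 0, 0, -1, 1])

            raw = [random_claim() for _ in range(10_000)]
            curated = [(a, b, c) for a, b, c in raw if a + b == c]  # the curation step

            print(f"raw accuracy: {sum(a + b == c for a, b, c in raw) / len(raw):.0%}")  # ~60%
            print("curated accuracy: 100% (by construction)")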

          Without curation, as I already said, the best you get is a distillation of the source model, which is highly improbable to be more accurate.

        • FridgeSeal 16 hours ago

          No no no, you don’t understand: the models will magically overcome these issues and somehow become 100x better and deliver real AGI! Any day now! It’ll work because LLMs are basically magic!

          Also, can I have some money to build more data centres pls?

        • Jerrrry 16 hours ago

          No one knows whether the pigeonhole principle applies absolutely to the ability to generalize outside of a training set.

          That is the existential, $1T question.

    • bbor 17 hours ago

      I think you're missing the point being made here, IMHO: using an advanced model to build high quality training data (whatever that means for a given training paradigm) absolutely would increase the efficiency of the process. Remember that they're not fighting over sounding human, they're fighting over deliberative reasoning capabilities, something that's relatively rare in online discourse.

      Re: "generally a bad idea", I'd just highlight "generally" ;) Clearly it worked in this case!

      • tensor 17 hours ago

        It's trivial to build synthetic reasoning datasets, likely even in natural languages. This is a well established technique that works (e.g. see Microsoft Phi, among others).

        I said generally because there are things like adversarial training that use a ruleset to help generate correct datasets that work well. Outside of techniques like that it's not just a rule of thumb, it's always true that training on the output of another model will result in a worse model.

        https://www.scientificamerican.com/article/ai-generated-data...

        • numba888 16 hours ago

          > it's always true that training on the output of another model will result in a worse model.

          Not convincing.

          You can imagine a model doing some primitive thinking and coming to a conclusion. Then you can train another model on the summaries. If everything goes well, it will come to conclusions quicker, at a minimum. Or it may be able to solve more complex problems with the same amount of 'thinking'. It will be self-propelled evolution.

          Another option is to use one model to produce the 'thinking' part from known outputs, then train another to reproduce the thinking to get the right output, unknown to it initially. Using humans to create such a dataset would be slow and very expensive.

          PS: if it were impossible, humans would still be living in the trees.

          • tensor 14 hours ago

            Humans don't improve by "thinking." They improve by natural selection against a fitness function. If that fitness function is "doing better at math," then over a long time perhaps humans will get better at math.

            These models don't evolve like that; there is no random process of architectural evolution, nor is there a fitness function anything like "get better at math."

            A system like AlphaZero works because it has rules to use as an oracle: the game rules. The game rules provide the new training information needed to drive the process. Each game played produces new correct training data.

            These LLMs have no such oracle. Their fitness function is and remains: predict the next word, followed by: produce text that makes a human happy. Note that it's not "produce text that makes ChatGPT happy."

            • numba888 5 hours ago

              It's more complicated than that. What you get is defined by what you put in. At first it was random or selected internet garbage + books + docs, i.e. not designed for training. Then came tuning. Now we can use a trained model to generate data designed for training, with specific qualities, in this case reasoning, and train the next model on it. Just intuitively, it can be smaller and better at what we trained it for. I showed two options for how the data can be generated; there are others, of course.

              As for humans, assuming they genetically have the same intellectual abilities, you can see the difference in the development of different groups. It's mostly defined by how well the next generation is trained. Schools exist for exactly this.

          • DonHopkins 3 hours ago

            You're the sociopath who just confidently tried to blame the air crash on DEI and spewed baseless conspiracy theories about who was to blame, while bodies are still being recovered from the frozen river. What a disgusting ignorant cruel comment.

            There is absolutely no thought going on in your close-minded bigoted head, and I'm not at all interested in listening to you fill the internet with garbage by mansplaining your worthless uneducated opinions about AI (like your ridiculous eugenics theory that AI can estimate IQ from a picture of a face), since you've failed to achieve even natural intelligence, empathy, or humanity.

            Set showdead=true to see the evidence, because your comments were rightfully flagged dead and hidden as blatant violations of the rules, with no redeeming value:

            https://news.ycombinator.com/item?id=42875572

            https://news.ycombinator.com/item?id=42608244

    • smitelli 17 hours ago

      > training off of data generated by another AI is generally a bad idea

      Ah. So if I understand this... once the internet becomes completely overrun with AI-generated articles of no particular substance or importance, we should not bulk-scrape that internet again to train the subsequent generation of models.

      I look forward to that day.

      • bangaladore 17 hours ago

        That's already happened. It's well established now that the internet is tainted: since ChatGPT's public release, a not-insignificant amount of internet content has not been written by humans.

      • tensor 14 hours ago

        Yes, this is a real and serious concern that AI researchers have.

    • fumeux_fume 17 hours ago

      I think the point is that if R1 isn't possible without access to OpenAI (at low, subsidized costs) then this isn't really a breakthrough as much as a hack to clone an existing model.

      • tensor 17 hours ago

        The training techniques are a breakthrough no matter what data is used. It's not up for debate, it's an empirical question with a concrete answer. They can and did train orders of magnitude faster.

        • blast 17 hours ago

          Not arguing with your point about training efficiency, but the degree to which R1 is a technical breakthrough changes if they were calling an outside API to get the answers, no?

          It seems like the difference between someone doing a better writeup of (say) Wiles's proof vs. proving Fermat's Last Theorem independently.

          • pests 15 hours ago

            That outside API used to be humans, doing the work manually. Now we have ways to speed that up.

      • bbor 17 hours ago

        R1 is--as far as we know from good ol' ClosedAI--far more efficient. Even if it were a "clone", A) that would be a terribly impressive achievement on its own that Anthropic and Google would be mighty jealous of, and B) it's at the very least a distillation of O1's reasoning capabilities into a more svelte form.

    • athrowaway3z 17 hours ago

      That's not right either.

      It proves we _can_ optimize our training data.

      Even though humans have been genetically stable for a long time, the quality & structure of information available to a child today vs. 2000 years ago makes them more skilled at certain tasks, math being a good example.

    • Voloskaya 15 hours ago

      > First of all, training off of data generated by another AI is generally a bad idea because you'll end up with a strictly less accurate model (usually).

      That is not true at all.

      We have known how to solve this for at least 2 years now.

      All the latest state of the art models depend heavily on training on synthetic data.

      • bjourne 11 hours ago

        • Voloskaya 10 hours ago

          Key point from your linked paper:

          > We find that indiscriminate use of model-generated content in training causes irreversible defects in the resulting models

          No one is training on indiscriminate synthetic data. It's very much discriminated, but still synthetic.

    • jjallen 5 hours ago

      The DS R1 model is slightly better, though. So how does your statement square with that?

  • jjallen 5 hours ago

    That may be true. But an even more interesting point may be that you don't have to train a huge model ever again? Or at least not in order to get a new, slightly improved model, because now we have open weights of an excellent large model and a way to train smaller ones.

  • jondwillis 17 hours ago

    But it does mean the moat is even less defensible for companies whose fortunes are tied to their foundation models having some performance edge, and it suggests a shift in the kinds of hardware used for inference (smaller, closer to the edge).

  • bangaladore 17 hours ago

    That's only true if you assume that O1 synthetic data sets are much better than those from any other (comparably sized) open-source model.

    It's not at all obvious to me that that is the case.

    Ie. do you need a SOTA model to produce a new SOTA model?

  • joe_the_user 16 hours ago

    > If Deepseek trained off OpenAI, then it wasn't trained from scratch for "pennies on the dollar"

    If OpenAI trained on the intellectual property of others, maybe it wasn't the creativity breakthrough people claim?

    Conversely:

    If you say ChatGPT was trained on "whatever data was available", and you say Deepseek was trained on "whatever data was available", then they sound pretty equivalent.

    All the rough consensus language output of humanity is now roughly on the Internet. The various LLMs have roughly distilled that, and the results are naturally going to get tighter and tighter. It's not surprising that companies are going to get better and better at solving the same problem. The significance of DeepSeek isn't so much that it promises future achievements, but that it shows OpenAI's string of announcements to be incremental progress that isn't going to reach the AGI that Altman now often harps on.

    • el_cujo 14 hours ago

      I'm not an OpenAI apologist and don't like what they've done with other people's intellectual property, but I think that's kind of a false equivalency. OpenAI's GPT 3.5/4 was a big leap forward in the technology in terms of functionality. DeepSeek-r1 isn't really a huge step forward in output; it's mostly comparable to existing models. One thing that is really cool about it is that it can supposedly be trained from scratch quickly and cheaply, and that claim is completely undercut if it was trained off of OpenAI's data. I don't care about adjudicating which one is the bigger thief, but it's notable if one of the biggest breakthroughs of DeepSeek-r1 is pretty much a lie. And it's still really cool that it's open source and can be run locally; it'll have that over OpenAI whether or not the training claims are misleading.

      • buzzerbetrayed 11 hours ago

        How is it a “lie” for DeepSeek to train their model on ChatGPT output but not if they train it on all of Twitter and Reddit? Either way the training is 100x cheaper.

      • pertymcpert 14 hours ago

        Not just the training cost, the inference cost is a fraction of o1.

  • philistine 17 hours ago

    There’s a question of scale here: was it trained on 1000 outputs or 5 million?

  • fumeux_fume 17 hours ago

    This has been in the back of my head since the news broke. Has anyone built their own R1 from scratch and validated it?

    • RevEng 16 hours ago

      In the last few days? No, that would be impossible; no one has the resources to train a base model that quickly. But there are definitely a lot of people working on it.

  • FooBarWidget 15 hours ago

    Have people on HN never heard of public ChatGPT conversations data sets? They've been mentioned multiple times in past HN conversations and I thought it'd be common knowledge here by now. Pretty much all open source models have been training on them for the past 2 years, it's common practice by now. And haven't people been having conversations about "synthetic data" for a pretty long time by now? Why is all of this suddenly an issue in the context of DeepSeek? Nobody made a fuss about this before.

    And just because a model trains on some ChatGPT data, doesn't mean that that data is the majority. It's just another dataset.

  • ohhhhhhhhhk 12 hours ago

    Funny how the first principles people now want to claim the opposite of what they’ve been crowing about for decades since techbros climbed their way out of their billion dollar one hit wonders. Boo fucking hoo.

ok123456 20 hours ago

OpenAI's models were trained on ebooks from a private ebook torrent tracker, leeched en masse during a free-leech event by people who hated private torrent trackers and wanted to destroy their "economy."

The books were all in epub format, converted, cleaned to plain text, and hosted on a public data hoarder site.

  • harry8 11 hours ago

    Have you got some support for this claim?

    There are a lot of wild claims about, so while this is plausible it would be great if there were some evidence backing it.

glenstein 19 hours ago

All the top level comments are basking in the irony of it, which is fair enough. But I think this changes the Deepseek narrative a bit. If they just benefited from repurposing OpenAI data, that's different than having achieved an engineering breakthrough, which may suggest OpenAI's results were hard earned after all.

  • tasuki 18 hours ago

    I understand they just used the API to talk to the OpenAI models. That... seems pretty innocent? Probably they even paid for it? OpenAI is selling API access, someone decided to buy it. Good for OpenAI!

    I understand ToS violations can lead to a ban. OpenAI is free to ban DeepSeek from using their APIs.

    • glenstein 15 hours ago

      Sure, but I'm not interested in innocence. They can be as innocent or guilty as they want. But it means they didn't, via engineering wherewithal, reproduce the OpenAI capabilities from scratch. And originally that was supposed to be one of the stunning and impressive (if true) implications of the whole Deepseek news cycle.

      • tasuki 14 hours ago

        Nothing is ever done "from scratch". To create a sandwich, you first have to create the universe.

        Yes, there is the question of how much ChatGPT data DeepSeek has ingested. Certainly not zero! But if DeepSeek has achieved iterative self-improvement, that'd be huge too!

        • danparsonson 11 hours ago

          "From scratch" has a specific definition here though - it means 'from the same or broadly the same corpus of data that OpenAI started with'. The implication was that DeepSeek had created something broadly equivalent to ChatGPT on their own and for much less cost; deriving it from an existing model is a different claim. It's a little like claiming you invented a car when actually you took an existing car and tuned and remodelled it - the end result may be impressive and useful and better than the original, but it's not really a new invention.

          • tasuki an hour ago

            Is it even possible to "invent a car" in the 21st century? When creating a car, you will necessarily be highly influenced by existing cars.

      • freehorse 14 hours ago

        It is not as if they are not open about how they did it. People are actually working on reproducing their results as they describe in the papers. Somebody has already reproduced the r1-zero rl training process on a smaller model (linked in some comment here).

      Even if o1 specifically was used (which is in itself doubtful), it does not mean that this was the main reason r1 succeeded, or that it could not have happened without it. The o1 outputs hide the CoT part, which is the most important piece here. Also, we are in 2025; scratch does not exist anymore. Creating better technology by building upon previous (widely available) technology has never been a controversial practice.

      • tw1984 4 hours ago

        > reproduce the OpenAI capabilities from scratch

        who cares. even if the claim is true, does that make the open source model less attractive?

        in fact, it implies that there is no moat in this game. openai can no longer maintain its stupid valuation, as other companies can just scrape its output and build better models at much lower costs.

        everything points to the exact same end result - DeepSeek democratized AI, OpenAI's old business model is dead.

    • Mengkudulangsat 17 hours ago

      That's how I understand it too.

      If your own API can leak your secret sauce without any malicious penetration, well, that's on you.

    • rubslopes 15 hours ago

      Additionally, I was under the impression that all those Chinese models were being trained using data from OpenAI and Anthropic. Were there not some reports that Qwen models referred to themselves as Claude?

  • JTyQZSnP3cQGa8B 16 hours ago

    > OpenAI's results were hard earned after all

    DDOSing web sites and grabbing content without anyone's consent is not hard earned at all. They did spend billions on their thing, but nothing was earned, as they could never have done that legally.

    • glenstein 15 hours ago

      I understand the temptation to go there, but I think it misses the point. I have no qualms at all with the idea that the sum total of intelligence distributed across the internet was siphoned away from creators and piped through an engine that now cynically seeks to replace them. Believe me, I will grab my pitchfork and march side by side with you.

      But let's keep the eye on the ball for a second. None of that changes the fact that what was built was a capability to reflect that knowledge in dynamic and deep ways in conversation, as well as image and audio recognition.

      And did Deepseek also build that? From scratch? Because they might not have.

      • rakejake 9 hours ago

        Look at it this way. Even OpenAI uses their own models' output to train subsequent models. They do pay for a lot of manual annotations but also use a lot of machine generated data because it is cheaper and good enough, especially from the bigger models.

        So say DS had simply published a paper outlining the RL technique they used, and one of Meta, Google, or even OpenAI themselves had used it to train a new model. Don't you think they'd have shouted from the rooftops about a new breakthrough? The fact that the provenance of the data is from a rival's model does not negate the value of the research, IMHO.

    • scotty79 15 hours ago

      More like hard bought and hard stolen.

  • janalsncm 2 hours ago

    IMO the important “narrative” is the one looking forward, not backwards. OpenAI’s valuation depends on LLMs being prohibitively difficult to train and run. Deepseek challenges that.

    Also, if you read their papers it's quite clear there are several important engineering achievements which enabled this, for example multi-head latent attention.

  • the_duke 17 hours ago

    These aren't mutually exclusive.

    It's been known for a while that competitors used OpenAI to improve their models, that's why they changed the TOS to forbid it.

    That doesn't mean the DeepSeek technical achievements are less valid.

    • glenstein 15 hours ago

      >That doesn't mean the DeepSeek technical achievements are less valid.

      Well, that's literally exactly what it would mean. If DeepSeek relied on OpenAI’s API, their main achievement is in efficiency and cost reduction as opposed to fundamental AI breakthroughs.

      • obmelvin 13 hours ago

        Agreed. They accomplished a lot with distillation and optimization - but there's little reason to believe you don't also need foundational models to keep advancing. Otherwise won't they run into issues training on more synthetic data?

        In a way this is something most companies have been doing with their smaller models; DeepSeek just (supposedly) did it better.

  • soerxpso 11 hours ago

    > If they just benefited from repurposing OpenAI data, that's different than having achieved an engineering breakthrough

    One way or another, they were able to create something that has WAY cheaper inference costs than o1 at the same level of intelligence. I was paying Anthropic $15/1M tokens to make myself 10x faster at writing software, which was coming out to $10/day. O1 is $60/1M tokens, which for my level of usage would mean that it costs as much as a whole junior software engineer. DeepSeek is able to do it for $2.50/1M tokens.

    Either OpenAI was taking a profit margin that would make the US Healthcare industry weep, or DeepSeek made an engineering breakthrough that increases inference efficiency by orders of magnitude.
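
    Making that arithmetic explicit in Python (the daily token volume is inferred from the "$10/day at $15/1M" figure above, and this ignores that a reasoning model may emit more output tokens per task):

      tokens_per_day = 10 / 15 * 1_000_000  # ~667k tokens/day, implied by $10/day

      for name, per_1m in [("Anthropic", 15.0), ("OpenAI o1", 60.0), ("DeepSeek R1", 2.50)]:
          daily = tokens_per_day / 1e6 * per_1m
          print(f"{name}: ${daily:.2f}/day")

      # Anthropic: $10.00/day
      # OpenAI o1: $40.00/day
      # DeepSeek R1: $1.67/day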

  • epolanski 17 hours ago

    I really don't see a correlation here to be honest.

    Eventually all future AIs will be produced with synthetic input, the amount of (quality) data we humans can produce is quite limited.

    The fact that the input of one AI has been used in the training of another one seems irrelevant.

    • glenstein 15 hours ago

      The issue isn’t just that AI trained on AI is inevitable; it's whose AI is being used as the base layer. Right now, OpenAI’s models are at the top of that hierarchy. If Deepseek depended on them, it means OpenAI is still the upstream bottleneck, not easily replaced.

      The deeper question is whether Deepseek has achieved real autonomy or if it’s just a derivative work. If the latter, then OpenAI still holds the keys to future advances. If Deepseek truly found a way to be independent while achieving similar performance, then OpenAI has a problem.

      The details of how they trained matter more than the inevitability of synthetic data down the line.

      • janalsncm an hour ago

        > whether Deepseek has achieved real autonomy or if it’s just a derivative work

        This question is malformed, imo. Every lab is doing derivative work. OpenAI didn’t invent transformers, Google did. Google didn’t invent neural networks or back propagation.

        If you mean whether OAI could have prevented DS from succeeding by cutting off their API access, probably not. Maybe they used OAI for supervised fine tuning in certain domains, like creative writing, which are difficult to formally verify (although they claim to have used one of their own models). Or perhaps during human preference tuning at the end. But either way, there are many roads to Rome, and OAI wasn’t the only game in town.

      • epolanski 14 hours ago

        > then OpenAI still holds the keys to future advances

        Point is, those future advances are worthless. Eventually anybody will be able to feed each other's data for the training.

        There's no moat here. LLMs are commodities.

        • glenstein 14 hours ago

          If LLMs were already pure commodities, OpenAI wouldn't be able to charge a premium, and DeepSeek wouldn’t have needed to distill their model from OpenAI in the first place. The fact that they did proves there’s still a moat—just maybe not as wide as OpenAI hoped.

  • plantwallshoe 18 hours ago

    Yeah what happens when we remove all financial incentive to fund groundbreaking science?

    It’s the same problem with pharmaceuticals and generics. It’s great when the price of drugs is low, but without perverse financial incentives no company is going to burn billions of dollars in a risky search for new medicines.

    • amarcheschi 17 hours ago

      In this case, these cures (llms) are medicines in search for a disease to cure. I got Ai shoved everywhere, where I just want it to aid in my coding. Literally, that's it. They're also good at summarizing emails and similar things, but I know nobody who does that. I wouldn't trust an Ai reading and possibly hallucinate emails

    • jjcob 16 hours ago

      Then we just have to fund research by giving grants to universities and research teams. Oh wait a sec: That's already what pretty much every government in the world is doing anyway!

  • nprateem 18 hours ago

    Of course. How else would Americans justify their superiority (and therefore valuations) if a load of foreigners for Christ's sake could just out innovate them?

    They had to be cheating.

    • dang 17 hours ago

      Please don't take HN threads into nationalistic flamewar. It's not what this site is for, and destroys what it is for.

      https://news.ycombinator.com/newsguidelines.html

      p.s. yes, that goes both ways - that is, if people are slamming a different country from an opposite direction, we say the same thing (provided we see the post in the first place)

      • LPisGood 16 hours ago

        I see where you’re coming from but that comment didn’t strike me as particularly inflammatory.

        • dang 15 hours ago

          I'm likely more sensitive to the fire potential on account of being conditioned by the job.

          Part of it is the form of the comment, btw - that one was entirely a sequence of indignation tropes.

daft_pink 19 hours ago

This reminds me of the railroads. Once railroads were invented, there was a huge investment boom of everyone trying to make money off them, but competition brought costs down to the point where it generally wasn't the railroads that made the money and got the benefit; the consumers and regular businesses did, and competition caused many railroads to fail.

AI is probably similar, where Moore's law and further advancement will eventually allow people to run open models locally and bring down the cost of operation. Competition will make it hard for all but one or two players to survive, and most investments in AI by large companies like Nvidia, OpenAI, and DeepSeek will fail to generate substantial wealth, though they may earn some sort of return, or maybe not.

  • floatrock 19 hours ago

    The railroads drama ended when JP Morgan (the person, not yet the entity) brought all the railroad bosses together, said "you all answer to me because I represent your investors / shareholders", and forced a wave of consolidation and syndicates because competition was bad for business.

    Then all the farmers in the midwest went broke not because they couldn't get their goods to market, but because JP Morgan's consolidated syndicates ate all their margin hauling their goods to market.

    Consolidation and monopoly over your competition is always the end goal.

    • DrScientist 18 hours ago

      > Consolidation and monopoly over your competition is always the end goal.

      Surely that's only possible when you have a large barrier to entry?

      What's going to be that barrier in this case? Because it turns out to be neither training costs/hardware nor secret expertise.

      • _DeadFred_ 16 hours ago

        Government regulation.

        'Can't have your data going to China'

        'Can't allow companies that do censorship aligned with foreign nations'

        'This company violated our laws and used an American company's tech for their training unfairly'

        And the government choosing winners.

        'The government is announcing 500 billion going to these chosen winners; anyone else, take the hint, give up, you won't get government contracts but will get pressure'.

        Good thing nobody is making these sorts of arguments today.

        • DrScientist 32 minutes ago

          Surely that will end in fragmentation along national lines if monopolies are defined by governments.

          Sure US economic power has a long reach right now because of the importance of the dollar etc - but the more it uses that to bully, the more countries are making sure they are independent.

        • astrange 15 hours ago

          The government isn't giving 500 billion to anyone. They just let Trump announce a private deal he has no involvement in.

          • _DeadFred_ 11 hours ago

            Correct, as I stated the government is just giving their 'blessing'.

      • floatrock 17 hours ago

        You figure that out and the VC's will be shovelling money into your face.

        I suspect the "it ain't training costs/hardware" bit is a bit exaggerated, since it ignores all the prior work that DeepSeek was built on top of.

        But, if all else fails, there's always the tried-and-true approaches: regulatory capture, industry entrenchment, use your VC bucks to be the last one who can wait out the costs the incumbents do face before they fold, etc.

        • jaredklewis 16 hours ago

          > I suspect the "it ain't training costs/hardware" bit is a bit exaggerated, since it ignores all the prior work that DeepSeek was built on top of.

          How does it ignore it? The success of Deepseek proves that training costs/hardware are definitely NOT a barrier to entry that protects OpenAI from competition. If anyone can train their model with ChatGPT for a fraction of the cost it took to train ChatGPT and get similar results, then how is that a barrier?

          • baq 14 hours ago

            Can anyone do that though? You need the tokens and the pipelines to feed them to the matmul mincers. Quoting only the dollar equivalent of GPU time is disingenuous at best.

            That's not to say they lie about everything; obviously the thing works amazingly well. The cost is understated by 10x or more, which is still not bad at all, I guess? But not mind-blowing.

            • maigret 3 hours ago

              Even if that's 10x, that's easy to counter. $50M can be invested by almost anyone. There are thousands of entities (incl. governments, even regional ones) who could easily bring such capital.

      • tdb7893 17 hours ago

        So I'm not an expert in this, but even with DeepSeek supposedly reducing training costs, isn't the estimate still in the millions (and that's presumably not counting a lot of costs)? And that wouldn't be counting a bunch of other barriers to actually building the business, since training a model is only one part. The barrier to entry still seems very high.

        Also barriers to entry aren't the only way to get a consolidated market anyway.

        • layer8 16 hours ago

          About your first point, IMO the usefulness of AI will remain relatively limited as long as we don’t have continuously learning AI. And once we have that, the disparity between training and inference may effectively disappear. Whether that means that such AI will become more accessible/affordable or less is a different question.

          • sitkack 11 hours ago

            We have that now, DeepSeek just proved it.

      • antisthenes 17 hours ago

        > Surely that's only possible when you have a large barrier to entry?

        As you grow bigger, you create barriers to entry where none existed before, whether intentionally or unintentionally.

      • yoyohello13 17 hours ago

        The large syndicate will create the barriers. Either via laws, or if that fails violence.

    • sitkack 11 hours ago

      This is why we saw the market correction, because the AI hegemony has been cracked.

    • boringg 15 hours ago

      This moment was also historically significant because it demonstrated how financial power (Morgan) could control industrial power (the railroads). A pattern that some say became increasingly important in American capitalism.

    • mrdevlar 17 hours ago

      Which is the exact goal of the current wave of Tech oligarchy also.

    • jonstewart 16 hours ago

      I just read _The Great River_ by Boyce Upholt, a history of the Mississippi river and human management thereof. It was funny how the railroads were used as a bogeyman to justify continued building of locks, dams, and other control structures on the Mississippi and its tributaries, long after shipping commodities down river had been supplanted by the railroads.

  • rgbrgb 17 hours ago

    I think that's a very possible outcome. A lot of people investing in AI are thinking there's a google moment coming where one monopoly will reign supreme. Google has strong network effects around user data AND economies of scale. Right now, AI is 1-player with much weaker network effects. The user data moat goes away once the model trains itself effectively and the economies of scale advantage goes away with smart small models that can be efficiently hosted by mortals/hobbyists. The DeepSeek result points to both of those happening in the near future. Interesting times.

  • mjburgess 19 hours ago

    For the curious, it was vertical integration in the railroad-oil/-coal industry which is where the money was made.

    The problem for AI is the hardware is commodified and offers no natural monopoly, so there isn't really anything obvious to vertically integrate-towards-monopoly.

    • fullshark 19 hours ago

      Aren’t we approaching a scenario where the software is commodified (or at least “good enough” software) and the hardware isn’t (NVIDIA GPUs have defined advantages)

      • mjburgess 19 hours ago

        I think the lesson of DeepSeek is 'no' -- that through software innovation (i.e., dropping below CUDA to program the GPU directly, working at 8-bit, etc.) you can trivialise the hardware requirement.

        However, I think the reality is that there's only so much coal to be mined, as far as LLM training goes. When we're at "very diminishing returns", SoC/Apple/TSMC CPU innovations will deliver cheap inference. We only really need an M4 Ultra with 1TB RAM to hollow out the hardware-inference-supplier market.

        Very easy to imagine a future where Apple releases a "Apple Intelligence Mac Studio" with the specs for many businesses to run arbitrary models.

        • daft_pink 18 hours ago

          I really hope that Apple realizes soon there is a market for a Mac Pro/Mac Studio with RAM in the TBs and a bunch of GPU cores for AI workloads, under $10k.

          • jppope 16 hours ago

            there was a company that recently built a desktop GPU for that exact thing. I'll see if I can find it

      • duped 18 hours ago

        Compute is literally being sold as a commodity today, software is not.

        • phkahler 14 hours ago

          >> Compute is literally being sold as a commodity today, software is not.

          The marginal cost of software is zero. You need some kind of perceived advantage to get people to pay for it. This isn't hard, as most people will pay a bit for big-name vs "free". That could change as more open source apps become popular by being awesome.

          • duped 5 hours ago

            Marginal cost has nothing to do with it - you can buy and sell compute like you could corn and beef at scale. You can't buy and sell software like that. In fact I'm surprised we don't have futures markets for things like compute and object storage.

  • yonran 15 hours ago

    I think a better analogy than railroads (which own the land that the track sits on and often valuable land around the station) is airlines, which don’t own land. I recall a relevant Warren Buffett letter that warned about investing hundreds of millions of dollars into capital with no moat:

    > Similarly, business growth, per se, tells us little about value. It's true that growth often has a positive impact on value, sometimes one of spectacular proportions. But such an effect is far from certain. For example, investors have regularly poured money into the domestic airline business to finance profitless (or worse) growth. For these investors, it would have been far better if Orville had failed to get off the ground at Kitty Hawk: The more the industry has grown, the worse the disaster for owners.

    https://www.berkshirehathaway.com/letters/1992.html

  • UncleOxidant 17 hours ago

    > where Moore's law and further advancement will eventually allow people to run open models locally

    Probably won't be Moore's law (which is kind of slowing down) so much as architectural improvements (both on the compute side and the model side - you could say that R1 represents an architectural improvement of efficiency on the model side).

  • taco_emoji 16 hours ago

    Main difference is that railroads are actually useful

wanderingmoose 18 hours ago

There is a lot of discussion here about IP theft. Honest question, from deepseek's point of view as a company under a different set of laws than US/Western -- was there IP theft?

A company like OpenAI can put whatever licensing they want in place. But that only matters if they can enforce it. The question is, can they enforce it against deepseek? Did deepseek do something illegal under the laws of their originating country?

I've had some limited exposure to media related licensing when releasing content in China and what is allowed is very different than what is permitted in the US.

The interesting part, which points to innovation moving outside of the US, is that US companies are beholden to strict IP laws while many places in the world don't have such restrictions and will be able to utilize more data more easily.

  • fulafel 15 hours ago

    What law would be broken here? Seems that copyright wouldn't apply unless they somehow snatched the OpenAI models verbatim.

  • thiago_fm 18 hours ago

    The most interesting part is that China has been ahead of the US in AI for many years, just not in LLMs.

    You need to visit mainland China and see how AI applications are everywhere, from transport to goods shipping.

    I'm not surprised at all. I hope this in the end makes the US kill its strict IP laws, which is the problem.

    If the US doesn't, China will always have a huge edge on it, no matter how much NVidia hardware the US has.

    And you know what, Huawei is already making inference hardware... it won't take them long to finally copy the TSMC tech and flip the situation upside down.

    When China can make the equivalent of H100s, it will be hilarious because they will sell for $10 in Aliexpress :-)

    • twobitshifter 17 hours ago

      You don’t even need to visit china, just read the latest research papers and look at the authors. China has more researchers in AI than the West and that’s a proven way to build an advantage.

      • nicce 13 hours ago

        It is also funny in a different way. Many people don't realise that they live in some sort of bubble. Many people in "The West" think that they are still the center of the world in everything, while this might not be so correct anymore.

        In the U.S. there is 350 million people and EU has 520 million people (excluding Russia and Turkey).

        China alone has 1.4 billion people.

        Since there is a language barrier and China isolates itself pretty well from the internet, we forget that there is a huge society there with a high focus on science. And most of our tech products come from there.

        • mordae 3 hours ago

          Not just that. They have 19% of people with tertiary education.

          So about as many as US has adults.

        • realusername 8 hours ago

          > China alone has 1.4 billion people.

          There are some clues that their population count isn't accurate and may be closer to 1.2 billion in reality, not that it changes the conclusion.

          • new_user_final 5 hours ago

            More accurately, more than 1 billion. So the US population is within their rounding error.

    • nostradumbasp 16 hours ago

      Maybe not $10, unless they are loss-leading toward dominance. Well, they actually could do exactly that... Hm, yeah, good points. I would expect prices at least an order or two of magnitude higher, to prevent an inferno.

      Let's be fair though: replicating TSMC isn't something that could happen quickly. Then again, who knows how far along they already are...

me551ah 20 hours ago

OpenAI is going after a company that open sourced their model, by distilling from their non-open AI?

OpenAI talks a lot about the principles of being Open, while still keeping their models closed and not fostering the open source community or sharing their research. Now when a company distills their models using perfectly allowed methods on the public internet, OpenAI wants to shut them down too?

High time OpenAI changes their name to ClosedAI

  • alexathrowawa9 15 hours ago

    The name OpenAI gets more ridiculous by the day

    Would not be surprised if they do a rebrand eventually

janalsncm 42 minutes ago

I think readers should note that the article did not provide any evidence for OpenAI’s claims, only OpenAI declining to provide evidence, various people repeating the claim, others reacting to it.

It does matter whether it happened and how much it happened. Deepseek ran head to head comparisons against O1 so it would be pretty reasonable for them to have made API calls, for example.

But also, as the article notes, distillation, supervised fine tuning, and using LLM as a judge are all common techniques in research, which OpenAI knows very well.

Ciantic 20 hours ago

I'm not being sarcastic, but we may soon have to torrent DeepSeek's model. OpenAI has a lot of clout in the US and could get DeepSeek banned in western countries for copyright.

  • timeon 18 hours ago

    > US and could get DeepSeek banned in western countries for copyright

    If the US is going to proceed with a trade war on the EU, as it was planning anyway, then DeepSeek will be banned only in the US. It seems the term "western countries" is slowly eroding.

    • bbor 17 hours ago

      Great point. Plus, the revival of serious talk of the Monroe Doctrine (!!!) in the U.S. government lends a possibly completely-new meaning to "western countries" -- i.e. the Americas...

      • surgical_fire 17 hours ago

        Except the US has only contempt for anything south of Texas. Perhaps "western countries" will be reduced to US and Canada.

        Many countries in Latin America have better relations and more robust trade partnerships with China.

        As for the EU, I think it will be great for it to shed its reliance on the US, and act more independently from it.

        • ta1243 16 hours ago

          The US is talking about annexing Canada, so "western countries" means the USA, which if continuing down this path long enough will become a pariah

          • mordae 3 hours ago

            This always reminds me of the Fallout opening video.

      • marcosdumay 16 hours ago

        Only if they do it by force.

        Trump has already managed to completely destroy the US reputation within basically the entire continent¹. And he seems intent on creating a commercial war against all the countries here too.

        1 - Do not capture and torture random people on the street if you want to maintain some goodwill. Even if you have reasons to capture them.

        • bbor 12 hours ago

          Yeah... I don't think goodwill was ever a very central part of the Monroe doctrine. It's imperial expansionism, plain n' simple. Embargo + pressure who you can, depose any governments that resist, threaten the rest into silent compliance.

          Scary times.

  • aerhardt 16 hours ago

    Unfathomable to me that they'd make themselves look so foolish by trying to ban a piece of software.

    • forgotoldacc 8 hours ago

      It wouldn't be foolish. The US has an active cult of personality, and whatever the leader says, half the country believes it unquestioningly. If OpenAI is said to be protecting America and DeepSeek is doing terrible, terrible things to the children (many smart people are saying it), there'll be an overnight pivot to half the country screaming for it to be banned and harassing anyone who says otherwise.

      Who cares if some people think you look foolish when you have a locked down 500 billion dollar investment guarantee?

  • alchemist1e9 20 hours ago

    I think it's most likely that all sorts of data and models need a decentralized LLM archive via torrents etc.

    It’s not limited to the models themselves but also OpenAI will probably work towards shutting down access to training data sets also.

    IMHO it's probably an emergency, all-hands-on-deck problem.

  • sergiotapia 14 hours ago

    that would be suicide - that company only exists because they stole content from every single person, website and media company on the planet.

dragonwriter 17 hours ago

Hey, OpenAI, so, you know that legal theory that is the entire basis of your argument that any of your products are legal? "Training AI on proprietary data is a use that doesn't require permission from the owner of the data"?

You might want to consider how it applies to this situation.

TylerJaacks 9 hours ago

Cry me a fucking river OpenAI, as if your business model isn't entirely based on this exact same thing.

esskay 11 hours ago

Hard to have much sympathy for OpenAI's position when they're actively stealing content, ignoring requests to stop, and then spending huge amounts to get around sites running AI-poisoning scripts, making it clear they'll take your content regardless of whether you consent to it.

  • michaelmarkell 11 hours ago

    Can someone with more expertise help me understand what I'm looking at here? https://crt.sh/?id=10106356492

    It looks like Deepseek had a subdomain called "openai-us1.deepseek.com". What is a legitimate use-case for hosting an openai proxy(?) on your subdomain like this?

    Not implying anything's off here, but it's interesting to me that this OpenAI entity is one of the few subdomains they have on their site

    • gkbrk 4 hours ago

      Could just be an OpenAI-compatible endpoint too. A lot of LLM tools use OpenAI compatible APIs, just like a lot of Object Storage tools use S3 compatible APIs.
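
      For example, lots of tooling only needs a base-URL swap to talk to any OpenAI-compatible server. A sketch with the official openai Python client (the endpoint URL here is purely illustrative; nobody outside DeepSeek knows what that subdomain actually served):

        from openai import OpenAI

        # point the standard client at an OpenAI-compatible server
        client = OpenAI(
            base_url="https://openai-us1.deepseek.com/v1",  # hypothetical endpoint
            api_key="YOUR_KEY",
        )

        resp = client.chat.completions.create(
            model="deepseek-reasoner",
            messages=[{"role": "user", "content": "Hello"}],
        )
        print(resp.choices[0].message.content)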

mhitza 20 hours ago

This is funny because it's:

1. Something I'd expect to happen.

2. Lived through a similar scenario in 2010 or so.

Early in my professional career I worked for a media company that was scraping other sites (think Craigslist, but for our local market) to republish the content on our competing website. I wasn't working on that specific project, but I did work on an integration in my team's project where the scraping team could post jobs on our platform directly. When others started scraping "our content", there were a couple of urgent all-hands-on-deck meetings scheduled, with a high level of disbelief.

  • kigiri 18 hours ago

    Nice one, thank you for sharing !

emsign 6 hours ago

"yOu ShOuLdN't TaKe OtHeR pEoPlE's DaTa!1!1" are they mental? How can people at OpenAI lack be so self-righteous and unaware? Is thia arrogance or a mental illness?

karim79 10 hours ago

Oh God. I know exactly how this feels. A few years ago I made a bread hydration and conversion calculator for a friend, and put it up on JSFiddle. My friend, at the time, was an apprentice baker.

Just weeks later, I discovered that others were pulling off similar calculations! They were making great bread with ease and not having to resort to notebooks and calculators! The horror! I can't believe that said close friend of mine would actually share those highly hydraty mathematical formulas with other humans without first requesting my consent </sarc>.

Could it be that this stuff just ends up in the dumpster of "sorry, you can't patent math" or the like?

sho_hn 20 hours ago

While I'm as amused as everyone else, I think it's technically accurate to point out that the "we trained it for $6M" narrative is contingent on the investment already made by others.

  • Palmik 20 hours ago

    When I use NVIDIA GPUs to train a model, I do not consider the R&D cost to develop all of those GPUs as part of my costs.

    When I use an API to generate some data, I do not consider the R&D cost to develop the API as part of my costs.

  • kobalsky 19 hours ago

    OpenAI has been in a war-room for days searching for a match in the data, and they just came out with this without providing proof.

    My cynical opinion is that the training corpus has some small amount of data generated by OpenAI, which is probably impossible to avoid at this point, and they are hanging on that thread for dear life.

  • bbqfog 20 hours ago

    OpenAI's models were also trained on billions of dollars of "free" labor that produced the content that it was trained on.

    • sho_hn 20 hours ago

      Oh, absolutely. I'm not defending OpenAI, I just care about accurate reporting. Even on HN - even in this thread - you see people who came away with the conclusion that DeepSeek did what OpenAI did while "cutting cost by 27x".

      But that's a bit like saying that by painting a bare wall green you have demonstrated that you can build green walls 27x cheaper, ignoring the cost of building the wall in the first place.

      Smarter reporting and discourse would explain how this iterative process actually works and who is building on who and how, not frame it as two competing from-scratch clean room efforts. It'd help clear up expectations of what's coming next.

      It's a bit similar to how many are saying DeepSeek have demonstrated independence from nVidia, when part of the clever thing they did was figure out how to make the intentionally gimped H800s work for their training runs by doing low-level optimizations that are more nVidia-specific, etc.

      Rarely have I seen a highly technical topic produce more uninformed snap takes than this week.

      • Palmik 20 hours ago

        You are underselling or not understanding the breakthrough. They trained a 600B-parameter model on 15T tokens for <$6M. Regardless of the provenance of the tokens, this in itself is impressive.

        Not to mention post-training. Their novel GRPO technique used for preference optimization / alignment is also much more efficient than PPO.
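        For reference, a minimal sketch of the group-relative advantage idea at the heart of GRPO (simplified; the real objective also has a clipped policy ratio and a KL penalty, and the function name here is made up):

          import statistics

          def grpo_advantages(rewards):
              # Score each of the responses sampled for one prompt against the
              # mean and std of its own group, so no learned value network
              # (as PPO uses) is needed.
              mean = statistics.mean(rewards)
              std = statistics.pstdev(rewards) or 1.0  # guard division by zero
              return [(r - mean) / std for r in rewards]

          # e.g. four responses to one prompt, scored by a reward function
          print(grpo_advantages([0.2, 0.9, 0.4, 0.9]))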

        • sho_hn 19 hours ago

          Let's call it underselling. :-) Mostly because I'm not sure anyone's independently done the math and we just have a single statement from the CEO. I do appreciate the algorithmic improvements, and the excellent attention-to-performance-in-detail stuff in their implementation (careful treatment of precision, etc.), making the H800s useful, etc. I agree there's a lot there.

      • visarga 13 hours ago

        > that's a bit like saying that by painting a bare wall green you have demonstrated that you can build green walls 27x cheaper, ignoring the cost of building the wall in the first place

        That's a funny analogy, but in reality DeepSeek did reinforcement learning to generate chain of thought, which was used in the end to finetune LLMs. The RL model was called DeepSeek-R1-Zero, while the SFT model is DeepSeek-R1.

        They might have bootstrapped the Zero model with some demonstrations.

        > DeepSeek-R1-Zero struggles with challenges like poor readability, and language mixing. To make reasoning processes more readable and share them with the open community, we explore DeepSeek-R1, a method that utilizes RL with human-friendly cold-start data.

        > Unlike DeepSeek-R1-Zero, to prevent the early unstable cold start phase of RL training from the base model, for DeepSeek-R1 we construct and collect a small amount of long CoT data to fine-tune the model as the initial RL actor. To collect such data, we have explored several approaches: using few-shot prompting with a long CoT as an example, directly prompting models to generate detailed answers with reflection and verification, gathering DeepSeek-R1-Zero outputs in a readable format, and refining the results through post-processing by human annotators.

      • bbqfog 20 hours ago

        I don't agree. Walls are physical items, so your example holds for them, but models are data. Anyone can train off of these models; that's the current environment we exist in. Just like OpenAI trained on data that has since been locked up in a lot of cases. In 2025, training models like DeepSeek is indeed 27x cheaper, and that includes both their innovations and the existence of new "raw material" to do such a thing.

        • sho_hn 20 hours ago

          I don't think we disagree at all, actually!

          What I'm saying is that in the media it's being portrayed as if DeepSeek did the same thing OpenAI did, 27x cheaper, and the outsized market reaction is in large part a response to that narrative. While the reality is more that being a fast follower is cheaper (for concrete reasons, e.g. being able to source training data synthetically from prior LLMs, among other things), which shouldn't have surprised anyone and is just how technology in general trends.

          The achievement of DeepSeek is putting together a competent team that excels at end-to-end implementation, which is no small feat and is promising w.r.t. their future efforts.

          • meiraleal 19 hours ago

            How much money would a third company need to spend to achieve what OpenAI achieved and compete with them: $5 billion, or $6 million?

  • freehorse 15 hours ago

    That is the case anyway for training any LLM. It is contingent on the work done by all those who produced the data.

  • scotty79 15 hours ago

    The opposite: it amounts to claiming that OpenAI could by now have built a better-performing, cheaper-to-run model (compared to what they published) by training it at 1% of the cost on the output of their previous models... but chose not to.

1970-01-01 20 hours ago

DeepSeek have more integrity than 'Open'AI by not even pretending to care about that.

  • jampekka 17 hours ago

    And seem to be more actively fulfilling the mission that 'Open'AI pretends to strive for.

    • pixelpoet 14 hours ago

      Exactly, they actually opened up the model and research, which the "Open" company didn't, and merely adjusted some of their pricing tiers to try to compete commercially (but not without mumbling something like "yeah, we totally had these ideas too"). Now every single Meta, OpenAI etc. engineer is trying to copy DeepSeek's innovations, and their first act is to... complain about copyright infringement, of all things?! What an absolute clown party. How can these people take themselves seriously? Do they just have zero comprehension of what hypocrisy is, or of what's going on here?

      I can scarcely process all the levels of irony involved, the irony-o-meter is pegged and I can't get the good one from the safe because I'm incapacitated from laughter.

      • tim333 12 hours ago

        Altman was in a bit of a tricky position, in that he figured OpenAI would need a lot of money for compute to be able to compete, but it was hard to raise that while remaining open. DeepSeek benefits from being funded by their own hedge fund. I wonder if part of their strategy is to crack AI and then have it trade the markets?

        • jampekka 5 hours ago

          The last (only?) language model OpenAI released openly was GPT-2, and even for that the instruction weighted model was never released. This was in 2019. The large Microsoft deal was done in 2023.

olalonde 9 hours ago

If it's true, how is it problematic? It seems aligned with their mission:

> We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome.

> We will actively cooperate with other research and policy institutions; we seek to create a global community working together to address AGI’s global challenges.

https://openai.com/charter/

/s, we all know what their true mission is...

kamranjon 8 hours ago

I was just wondering if this is even feasible?

The amount of training DeepSeek would need to actually learn anything from OpenAI would seem to require an insane number of requests to a non-local model, which you'd think would be immediately obvious to OpenAI just by looking for suspicious request patterns.

Am I correct in this assumption or am I missing something? Is it even realistic that something like this is possible without a local model?
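For scale, a rough back-of-envelope sketch — every number below is an illustrative assumption, not a reported figure:

    # Hypothetical sizing of a distillation dataset gathered via a paid API.
    samples = 800_000          # assumed SFT set size (R1's paper describes ~800k
                               # samples, attributed to DeepSeek's own models)
    tokens_per_sample = 2_000  # assumed prompt + long chain-of-thought response
    print(f"{samples * tokens_per_sample / 1e9:.1f}B tokens")  # 1.6B tokens

    days = 90                  # assumed collection window
    print(f"~{samples / days:,.0f} requests/day")  # ~8,889/day - a lot, but small
                               # next to traffic from millions of ordinary users

Spread across many accounts, that volume could plausibly blend into normal traffic, which may be why detection would have to rely on patterns rather than raw volume.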

concerndc1tizen 18 hours ago

Is OpenAI claiming copyright ownership over the generated synthetic data?

That would be a dangerous precedent to establish.

If it's a terms of service violation, I guess they're within their rights to terminate service, but what other recourse do they have?

Other than that, perhaps this is just rhetoric aimed at introducing restrictions in the US, to prevent access to foreign AI, to establish a national monopoly?

xinayder 2 hours ago

> OpenAI declined to comment further or provide details of its evidence. Its terms of service state users cannot “copy” any of its services or “use output to develop models that compete with OpenAI”.

Well, this sounds like they're just crying because they're losing the race so far. Besides, DeepSeek explicitly states they did a study on distillation, and then OpenAI is like "oh see guys, they used our models!!!!"

  • khazhoux 2 hours ago

    By what metric are they losing?

    • xinayder 2 hours ago

      DeepSeek is a fraction of the cost of ChatGPT; they needed far fewer resources than OpenAI. This is essentially what caused the massive selloff in Nvidia, as a new competitor model is just as good and requires a fraction of the massive costs.

      I don't remember the correct metric but the cost for DeepSeek was like $15/mo while ChatGPT was $200

      • khazhoux an hour ago

        You said "they're losing the race." They might lose, but I don't think we're seeing that yet. They undoubtedly gained a competitor over the weekend, but that didn't change their position as the leading AI company overnight.

        Correct me if my understanding is wrong, but if OpenAI's accusation is correct and DS is a derivative work, then isn't it inaccurate to say DS reached ChatGPT performance "at a fraction of the cost"? If true, seems like it's more accurate to say that they were able to copy an expensive model, at low expense.

        • xinayder an hour ago

          I agree in a way, but in that case Gemini, Claude, and Qwen are all derivations of each other and shouldn't be in the competition either.

          DeepSeek did some studies on distillation, which might be what OpenAI is complaining about. But their bigger model is not a distilled version of OpenAI's.

          • khazhoux an hour ago

            To me the big question (which HN can't be bothered to discuss because SaM aLtMaN iS bAd) is whether DS shows that OpenAI can be done cheaply, or just copied cheaply. Your last sentence tells me you think DS built this from scratch. I haven't seen evidence of that.

divbzero 15 hours ago

I was wondering if this might be the case, similar to how Bing’s initial training included Google’s search results [1]. I’d be curious to see more details of OpenAI’s evidence.

It is, of course, quite ironic for OpenAI to indiscriminately scrape the entire web and then complain about being scraped themselves.

[1]: https://searchengineland.com/google-bing-is-cheating-copying...

thorum 20 hours ago

> “It is (relatively) easy to copy something that you know works,” Altman tweeted. “It is extremely hard to do something new, risky, and difficult when you don’t know if it will work.”

The humor/hypocrisy of the situation aside, it does seem to be true that OpenAI is consistently the one coming up with new ideas first (GPT 4, o1, 4o-style multimodality, voice chat, DALL-E, …) and then other companies reproduce their work, and get more credit because they actually publish the research.

Unfortunately for them it’s challenging to profit in the long term from being first in this space and the time it takes for each new idea to be reproduced is getting shorter.

  • Hatchback7599 16 hours ago

    Reminds me of the Bill Gates quote when Steve Jobs accused him of stealing the ideas of Windows from Mac:

    Well, Steve... I think it’s more like we both had this rich neighbor named Xerox and I broke into his house to steal the TV set and found out that you had already stolen it.

    Xerox could be seen as Google, whose researchers produced the landmark Attention Is All You Need paper, and the general public, who provided all of the training data to make these models possible.

  • rndphs 15 hours ago

    > OpenAI is consistently the one coming up with new ideas first (GPT 4, o1, 4o-style multimodality, voice chat, DALL-E, …)

    As far as I can tell o1 was based on Q-star, which could likely be Quiet-STaR, a CoT RL technique developed at Stanford that OpenAI may have learned about before it got published. Presumably that's why they never used the Q-Star name even though it had garnered mystique and would have been good for building hype. This is just speculation, but since OpenAI haven't published their technique, we can't know if it really was their innovation.

  • turtlesdown11 19 hours ago

    > other companies reproduce their work, and get more credit because they actually publish the research.

    I don't understand, you mean OpenAI isn't releasing open models and openly publishing their research?

    • Tostino 15 hours ago

      Are you being sarcastic? (Honestly, it's hard to tell after reading as many uninformed takes in the past week as I have.)

      No, they aren't (other than Whisper).

      Their "papers" are closer to marketing materials. Very intentionally leaving out tons of technical information.

  • weego 17 hours ago

    Boy who stole test papers complains about child copying his answers.

    • FridgeSeal 16 hours ago

      No you don’t understand, AI is “dangerous” and only him and his uber rich billionaire mates should get to control it!

  • namuol 15 hours ago

    The eye-watering funding numbers proposed by Altman in the past, and more recently with "Stargate", suggest a publicly-funded research pivot is not out of the question. Could see a big defense department grant being given. Sigh.

  • actuallyalys 19 hours ago

    There’s some truth in that, but isn’t making a radically cheaper version also a new idea that deepseek didn’t know whether it would work? I mean, there was already research into distillation, but there was already research into some of (most of?) OpenAI’s ideas.

  • joe_the_user 17 hours ago

    > The humor/hypocrisy of the situation aside, it does seem to be true that OpenAI is consistently the one coming up with new ideas first (GPT 4, o1, 4o-style multimodality, voice chat, DALL-E, …) and then other companies reproduce their work, and get more credit because they actually publish the research

    I claim one just can't put the humor/hypocrisy aside that easily.

    What OpenAI did with the release of ChatGPT was productize research that was open and ongoing, with DeepMind and others leading at least as much. And everything after that has been an extension of the basic approach: improved, expanded, but ultimately the same sort of beast. One might even say the situation of OpenAI to DeepMind was like Apple to Xerox. Productizing is nothing to sneeze at; it requires creativity and work to productize basic research. But naturally you get end users who consider the productizers the "fountainheads", and who overestimate the productizers because products are all they see.

  • mistercheph 16 hours ago

    Not really, they just put their eye to where everyone knows the ball is going and publish fake / cherrypicked results and then pretend like they got there first (o1, gpt voice, sora)

  • spencerflem 20 hours ago

    Fortunately, OpenAI doesn't need to make money because they are a nonprofit dedicated to the safe and transparent advancement of AI for all of humanity

    • mjburgess 19 hours ago

      ...somewhere a yacht salesman cried out in terror

kranke155 12 hours ago

The very idea that it's OK for OpenAI to scrape the entire internet and ignore individual rights, but that it's a gross violation of the law/ToS if another company takes the output data from their model - that very idea is evil.

bhouston 20 hours ago

The US government will likely favor a large strategic company like OpenAI over individuals' copyrights, so while ironic, the US government definitely doesn't care.

And the US government is also likely itching to reduce the power of Chinese AI companies that could outcompete their US rivals (similar to the treatment of BYD, TikTok, solar panel manufacturers, network equipment manufacturers, etc.), so expect sweeping legislation that blocks access to all Chinese AI endeavours, first in the US and soon after in US allies/the West (via US pressure).

The likely legislation will be on the surface justified both by security concerns and by intellectual property concerns, but ultimately it will be motivated by winning the economic competition between China and the US and it will attempt to tilt the balance via explicitly protectionist policies.

  • derektank 19 hours ago

    >The US government will likely favor a large strategic company like OpenAI over individuals' copyrights

    Even if we assume this is true, Disney and Netflix are both currently worth more than OpenAI and both rely on the strict enforcement of US copyright law. I do not think it is so obvious which powers that be have the better lobbying efforts and, currently, it's looking like this question will mostly be adjudicated by the courts, not Congress, anyways.

    • bhouston 19 hours ago

      I don't think OpenAI stole from Disney or Netflix. Rather, OpenAI stole from individual artists, YouTube, and other social media, whose users do not really have any lobbying power.

      So I think OpenAI, Disney and Netflix win together. Big companies tend to win.

      • mjburgess 19 hours ago

        > What are the first words of the disney movie, "Aladdin" ?

        The first words of Disney's Aladdin (1992) are spoken by the *Peddler*, the mysterious merchant at the beginning of the film. He says:

        "Ah, Salaam and good evening to you, worthy friend. Please, please, come closer..."

        He then continues with: "Too close! A little too close. There. Welcome to Agrabah. City of mystery, of enchantment, and the finest merchandise this side of the River Jordan, on sale today! Come on down!"

        This opening sets the stage for the story, introducing the magical and bustling world of Agrabah.

      • derektank 19 hours ago

        Disney owns ABC News; OpenAI almost certainly scraped their text data

        • bhouston 15 hours ago

          I agree with you.

      • worik 17 hours ago

        > Rather OpenAI stole from individual artists and YouTube and other social media

        "stole"?

        They consumed publicly available material on the Internet

        I am no fan of these billionaire capitalists and their henchpersons; condemn them for their multitude of sins.

        Consuming publicly available Internet resources is not one of them, IMO.

        • da_chicken 16 hours ago

          Being publicly available does not mean that copyright is invalid. Copyright gives the holders the right to restrict USE, not merely restrict reproduction. Adaptation is also an exclusive right of the copyright holder. You're not allowed to make derivative works.

        • Terr_ 14 hours ago

          > They consumed publicly available material on the Internet

          I agree that there are some important distinctions and word-choices to be made here, and that there are problems with equating training to "stealing", and that copyright infringement is not theft, etc.

          That said, if you zoom out to the overall conduct, it's fair to argue that the companies are doing something unethical, the same as if they paid an army of humans to memorize other people's work and then regurgitate slightly-reworded copies.

          • worik 12 hours ago

            > That said, if you zoom out to the overall conduct, it's fair to argue that the companies are doing something unethical, the same as if they paid an army of humans to memorize other people's work and then regurgitate slightly-reworded copies.

            I would use the analogy of those humans learning from the material. Like reading books in the library

            "regurgitate slightly-reworded copies" in my experience using LLMs (not insubstantial) that is an unfairly pejorative take on what they do

        • jdswain 16 hours ago

          It's not that they consumed publicly available material, it's that they re-published that information, and sold it.

        • visarga 16 hours ago

          They stole the data just as much as a painter steals the view.

          • rideontime 15 hours ago

            Who created the view?

            • visarga 13 hours ago

              The view is created by every spectator.

        • esskay 11 hours ago

          By that logic, a copy of the source code for a proprietary app that someone has stolen and placed online is immediately free for all to use as they wish.

          Being on the internet doesn't make it yours, or acceptable to take. In the case of OpenAI (and Anthropic), they should be following the long-held convention of the robots.txt file on sites, which can be set to tell just them that they may not take your content - they openly ignore that request.
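          For instance (GPTBot and ClaudeBot are the crawler names OpenAI and Anthropic document; whether such rules are honored is exactly the dispute):

            User-agent: GPTBot
            Disallow: /

            User-agent: ClaudeBot
            Disallow: /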

          OpenAI absolutely is stealing from everyone, hence why most will have little sympathy when they complain someone stole from them.

  • tokioyoyo 18 hours ago

    I don’t think the US government can move fast enough to change the trajectory. Also, it doesn’t help that basically every government is second-guessing their alliance with the US. It’s not an industry that can ruin local industries either (the way cheap BYD is bad for German cars).

    It’s a very fun thing to watch from the sidelines right now, if I’m honest.

  • buyucu 16 hours ago

    It's too late for that. That ship sailed a long time ago.

    The best language model right now is open source. Let that sink in.

bicepjai 5 hours ago

Reading this post, I can't help but wonder if people realize the irony in what they're saying:

1. "The issue is when you [take it out of the platform and] are doing it to create your own model for your own purposes,"

2. "There's a technique in AI called distillation . . . when one model learns from another model [and] kind of sucks the knowledge out of the parent model,"

  • palisade 5 hours ago

    Is this really the point OpenAI wants to start debating? When OpenAI steals everyone's data, it is fine. Right? But, let us pull the ladder up after that.

alasr 10 hours ago

> OpenAI says it has evidence DeepSeek used its model to train competitor.

> The San Francisco-based ChatGPT maker told the Financial Times it had seen some evidence of “distillation”, which it suspects to be from DeepSeek.

> ...

> OpenAI declined to comment further or provide details of its evidence. Its terms of service state users cannot “copy” any of its services or “use output to develop models that compete with OpenAI”.

OAI: share the evidence with the public, or accept the possibility that your case is not as strong as you're claiming here.

osigurdson 20 hours ago

I do think that distilling a model from another model is much less impressive than training one from raw text. However, it is hard to say if it is really illegal or even immoral; perhaps it's just one step further in the evolution of the space.

  • lemoncookiechip 20 hours ago

    It's about as illegal as infringing on the billions, if not trillions, of IP-protected works that ClosedAI used to train their own models without consent. Not that they're alone, and I personally don't mind that AI companies do it, but it's still amusing when they get this annoyed at others doing the same thing to them.

    • osigurdson 19 hours ago

      I think they had the advantage of being ahead of the law in this regard. To my knowledge, reading copyrighted material isn't (or wasn't) illegal, and it remains a legal grey area.

      Distilling weights from prompts and responses is even more of a legal grey area. The legal system cannot respond quickly to such technological advancements so things necessarily remain a wild west until technology reaches the asymptotic portion of the curve.

      In my view the most interesting thing is, do we really need vast data centers and innumerable GPUs for AGI? In other words, if intelligence is ultimately a function of power input, what is the shape of the curve?

      • lemoncookiechip 17 hours ago

        The main issue is that they've had plenty of instances where the LLM outputted copyrighted content verbatim, like it happened with the New York Times and some book authors. And then there's DALL-E, which is baked into ChatGPT and before all the guardrails came up, was clearly trained on copyrighted content to the point it had people's watermarks, as well as their styles, just like Stable Diffusion mixes can do (if you don't prompt it out).

        Like you've put, it's still a somewhat gray area, and I personally have nothing against them (or anyone else) using copyrighted content to train models.

        I do find it annoying that they're so closed-off about their tech when it's built on the shoulders of openness and other people's hard work. And then they turn around and throw hissy fits when someone copies their homework, allegedly.

      • greiskul 13 hours ago

        > Distilling weights from prompts and responses is even more of a legal grey area.

        Actually, unless the law changes, this is pretty settled territory in US law. AI output is not copyrightable, and is therefore in the public domain. The only legal avenue of attack OpenAI has is a Terms of Service violation, which is a much weaker claim than copyright, if it even holds.

      • ttesmer 18 hours ago

        > if intelligence is ultimately a function of power input, what is the shape of the curve?

        According to a quick google search, the human body consumes ~145W of power over 24h (eating 3000kcals/day). The brain needs ~20% of that so 29W/day. Much less than our current designs of software & (especially) hardware for AI.
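        For reference, the conversion (assuming 1 kcal = 4184 J):

          kcal_per_day = 3000
          joules_per_day = kcal_per_day * 4184   # 1 kcal = 4184 J
          watts = joules_per_day / 86_400        # seconds in a day
          print(f"body:  {watts:.0f} W")         # ~145 W
          print(f"brain: {0.2 * watts:.0f} W")   # ~20% of that, ~29 W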

        • osigurdson 16 hours ago

          I think you mean the brain uses 29W (i.e. not 29W/day). Also, I suspect that burgers are a higher entropy energy source than electricity so perhaps it is even less than that.

      • JTyQZSnP3cQGa8B 15 hours ago

        Acquiring copyrighted material without permission has always been illegal in France, and I'm sure in most other countries. Disney is another example of how it's not grey at all.

    • ReptileMan 16 hours ago

      Is the question of whether training AI on data is fair use settled yet? Because if it is not, it looks like fair use to me.

  • scotty79 15 hours ago

    Isn't it more impressive, given that training on model output usually leads to a worse model?

    If they actually figured out how to use the output of existing models to build a model that outperforms them, then that brings us closer to the singularity than any other development so far.

fimdomeio 14 hours ago

But what is the problem here? Isn't OpenAI's mission "to ensure that artificial general intelligence benefits all of humanity"? Sounds like success to me.

karim79 8 hours ago

So, banning high-powered chips to China has basically had the effect of turning them into extremophiles. I mean, that seems like a good plan </sarc>. Moreover, it is certainly slowing sales of one of the darling companies of the US (NVidia).

I just can't even begin to imagine what will come of this ridiculous techno-imperialism/AI arms race, or whatever you want to call it. It should not be too hard for China to create their own ASICs that do the same, and finally be done with this palaver.

oysmal 3 hours ago

Given that the training approach was open sourced, their claim can be independently verified. Huggingface is currently doing that with Open R1, so hopefully we will get a concrete answer to whether these accusations are merited or not.

nelblu 20 hours ago

Hahaha, I can't stop laughing... I don't know the validity of the claim, but I immediately thought of the British Museum complaining about theft.

  • grogenaut 5 hours ago

    There's an exhibit in the BM about how they're proud to be allowing the Egyptian government to take back some of the artifacts the British have been safeguarding for the world while Egypt was going through, essentially, "troubles".

    Right next to it is an older exhibit about how the original curator took cuneiform rolls and made them into necklace beads for his wife and rings(?) for himself.

    Either someone at the BM has a very British sense of humor or it's a gigantic whoosh. I laughed my ass off. People looked at me.

    • isaacremuant 2 hours ago

      The "safeguarding" propaganda is a typical go-to of the remnants of the British empire to keep their stolen goods.

      They even do it with the Chilean moai, which were never in any danger.

      It's all lies.

duchenne 12 hours ago

The reasoning happens in the chain of thought. But OpenAI (aka ClosedAI) doesn't show this part when you use the o1 model, whether through the API or chat. They hide it to prevent distillation. DeepSeek, though, has come up with something new.

  • manamorphic 12 hours ago

    Crazy how most people miss this simple logical deduction.

thih9 3 hours ago

I don't mind and I believe that a company with "open" in its name shouldn't mind either.

I hope this is actually true and OpenAI loses its close to monopoly status. Having a for profit entity safeguarding a popular resource like this sounds miserable for everyone else.

At the moment AI looks like a typical VC scheme: build something off someone else's work, sell it at cost at first, shove it down everyone's throats, and when it's too late, hike the prices. I don't like that.

metaxz 17 hours ago

I don't understand how OpenAI claims this would have happened. The weights are closed, and as far as I've read they are not complaining that DeepSeek hacked them and obtained the weights. So all DeepSeek could have done was query OpenAI and generate training data. But how much did they really query? It would presumably require a huge volume of requests through an external, paid-for API - is there any proof of this besides OpenAI's say-so? And even if it's true, they paid per token, so they paid for each and every token of training data. As I understand it, the requester owns the copyright on what is generated by OpenAI's models and is free to do what they want with it.

temporallobe 20 hours ago

OpenAI is also possibly in violation of many IP laws by scraping the entirety of the internet and using it to train their models, so there's that.

  • InkCanon 20 hours ago

    To my understanding, OpenAI won the case where it argued training was covered under fair use and did not infringe on copyright.

    • Austiiiiii 19 hours ago

      Is there any reason they wouldn't rule the same way on DeepSeek training on OpenAI data? After all, one of the big selling points of GPT has been that businesses can freely use the information provided. They're paying for the service, after all. I'd be very interested to know how DeepSeek's usage (very reasonably assuming that they paid for their OpenAI subscription) is any different.

      • ickelbawd 17 hours ago

        Businesses can’t freely use the information. There are terms of service freely agreed upon by the user which explicitly deny many use cases—training other models is just one. DeepSeek is not an American company nor is their leader in deep with the new administration. It seems far more likely that this will play out like tiktok—they’ll be attacked publicly and banned for national security reasons.

        • Austiiiiii 15 hours ago

          On further reading, I'll grant the first point. Although I wonder if they'll have a technical out—say they distilled from several smaller research companies that had distilled from OpenAI for research purposes, which to my understanding would not constitute a violation of the terms of service.

          As for it getting banned, TikTok was banned partly because of credible accounts of it having been used by China to track political enemies. Are we thinking they'll expand the argument on national security to say that any application that transfers data to China is a national security threat? Because that could be a very slippery slope.

          And in any case, such a measure seems like it would only bar access to the DeepSeek app. Surely no one could argue that the underlying open source model, if run locally on American soil, could constitute a security threat, right?

_moof 17 hours ago

This reminds me of a (probably apocryphal) story about fast food chains that made the rounds decades ago: McDonald's invests tons of time into finding the best real estate for new stores; Burger King just opens stores near McDonalds!

crowcroft 20 hours ago

The AI companies were happy to take whatever they want and put the onus of proving they were breaking the law onto publishers by challenging them to take things to court.

Don't get mad about possible data theft, prove it in court.

__MatrixMan__ 20 hours ago

If they want us to care they can open up their models so we can be the judge.

jgrall 17 hours ago

It’s not a good look when your technology is replicated for a fraction of the cost, and your response is to smear your competition with (probably) false accusations and cozy up to the US government to tighten already shortsighted export controls. Hubris & xenophobia are not going to serve American companies well. Personally I welcome the Chinese - or anyone else for that matter - developing advanced technologies as long as they are used for good. Humanity loses if we allow this stuff to be “owned” by a handful of companies or a single country.

cumulative00x 14 hours ago

There is a saying in Turkish that roughly goes: it takes a thief to catch a thief. I am not a big fan of China's tech either; however, it amuses me to watch how the big tech charlatans have been crying over the DeepSeek shock.

  • gosub100 14 hours ago

    It's true irony to see thieves getting stolen from.

krystofee 3 hours ago

I don't know if the point of this is just to derail public attention toward the narrative "hey, the Chinese stole our model, that's not fair, we need compute", when DeepSeek has clearly made some exceptional technical breakthroughs in the R1 and V3 models - breakthroughs that stand on their own even if data was taken from OpenAI.

InkCanon 20 hours ago

It's like that Dr Phil episode where he meets the guy who created Bum Fights!

  • selimthegrim 7 hours ago

    Dr. Phil is riding along with ICE now; I wonder what Bum Fights guy would have to say about that.

whatshisface 20 hours ago

It's reasonably likely that a lot of people linked to the federal government want to ban DeepSeek. You can tell it's being presented away from "they gave us a free set of weights" and towards "they destroyed $1T of shareholder value." (By revealing that Microsoft et al. paid way too much to OpenAI et al. for technology that was actually easy to reinvent.)

  • toomuchtodo 20 hours ago

    > "they destroyed $1T of shareholder value." (By revealing that Microsoft et al. paid way too much to OpenAI et al. for technology that was actually easy to reinvent.)

    The value was highly speculative, an illusion created by PR and sentiment momentum. "Hype value" not real value (unless you're able to realize it and dump those bags on someone else before fundamentals set in). Same thing happening with power companies downstream of the discovery that AI is not going to be a savior of sagging electricity demand. Overdriving the fundamentals is not value destruction, it is "I gambled and lost."

    https://www.bloomberg.com/news/articles/2025-01-28/deepseek-... | https://archive.today/mCemf

    "In the short run, the market is a voting machine but in the long run, it is a weighing machine."

    • DebtDeflation 19 hours ago

      What they really destroyed was the idea that OpenAI would be able to charge $200/month for their ChatGPT Pro subscription which includes o1. That was always ridiculous IMO. The Free tier and $20/month Plus tier along with their API business (minus any future plan to charge a ridiculous amount for API access to o1) will be fine.

      • toomuchtodo 19 hours ago

        > The Free tier and $20/month Plus tier along with their API business (minus any future plan to charge a ridiculous amount for API access to o1) will be fine.

        Do the unit economics make this sustainable?

        • DebtDeflation 18 hours ago

          If only there were a way to make the models more efficient. Oh wait.

          • jl6 18 hours ago

            But doesn’t Deepseek’s innovation apply only to training, not inference?

            • Zacharias030 17 hours ago

              Actually, no! If we take their paper at face value, the crucial innovations that make the model both strong and efficient are the much-reduced KV cache and their MoE approach:

              - Where a standard model needs to store two large vectors (key and value) for each token at inference time, and load/store those over and over from memory, DeepSeek V3/R1 stores only one smaller vector C, a "compression" from which the large k, v vectors can be decoded on the fly.

              - They use a fairly standard Mixture-of-Experts (MoE) approach, which works well in training thanks to their tricks, but whose inference-time advantages are immediate and equal to those of other MoE techniques: of the ~85% of the 600B+ params that sit inside the MoE layers, the model picks only a small fraction to use at each token inference step. This reduces FLOPs and memory IO by a large factor compared to a so-called dense model, where all weights are used for every token (cf. Llama 3 405B).
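              A toy sketch of the compressed-KV idea (illustrative shapes only; the paper's actual MLA formulation also splits out separate RoPE dimensions and uses many heads, which is where the savings multiply):

                import numpy as np

                d_model, d_latent, d_head = 7168, 512, 128   # assumed sizes
                rng = np.random.default_rng(0)
                W_down = rng.standard_normal((d_model, d_latent))  # compress
                W_uk = rng.standard_normal((d_latent, d_head))     # decode key
                W_uv = rng.standard_normal((d_latent, d_head))     # decode value

                h = rng.standard_normal(d_model)  # hidden state for one token
                c = h @ W_down              # cache only this small latent...
                k, v = c @ W_uk, c @ W_uv   # ...rebuild k, v per head on the fly

                # vs. a standard cache holding full k and v for every head:
                print(c.shape)  # (512,) cached, instead of 2 * 128 * n_heads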

            • freeone3000 17 hours ago

              Reducing R&D expense also reduces breakeven price.

    • scarface_74 19 hours ago

      The two podcasters who do the Acquired podcast spoke to Ballmer about some of Microsoft’s failed initiatives and acquisitions. He told them that at the end of the day “it’s only money”.

      All of the BigTech companies have enough cash flow from profitable lines of business to make speculative bets.

      • azemetre 19 hours ago

        It must be EZ mode to be a big tech executive: you somehow have all the power to make every decision while also never taking the fault for those decisions.

        • scarface_74 18 hours ago

          I would much rather have a company with a culture that isn't afraid to take calculated risks, and isn't afraid of repercussions when those risks fail, as long as it doesn't cause consumer harm.

          • azemetre 15 hours ago

            "Not doing consumer harm" is carrying a lot of weight there.

            Either way, what you describe is perfectly achievable for the workers, but at some point management needs to own up to its failures; getting rewarded because the board is also made up of executives at other big tech companies is a perverse incentive to never actually improve.

            • scarface_74 15 hours ago

              How did Microsoft’s losing bets do consumer harm?

              • azemetre 14 hours ago

                I mean forcing copilot everywhere I don't want it (nowhere) while jacking up prices to justify it and using Windows 11 to serve ads is harmful to me. There's also you know... the anticompetitive company that thinks buying new sectors is healthy.

                • scarface_74 14 hours ago

                  Today, Microsoft’s revenue mostly comes from Office and Azure. All except PowerPoint were written and designed by MS.

    • cft 20 hours ago

      Since the time when companies en masse stopped paying cash dividends on owned shares, the value has become highly speculative. In the absence of dividend payments, the stock pricing mechanism is not essentially different from Solana or Ethereum "price" discovery.

      • toomuchtodo 20 hours ago

        I don't disagree that price discovery is harder, but I can with more certainty give an honest valuation of CLF or DOW vs OpenAI's "who knows what money will look like after we succeed, you should view your investment as a donation" nonsense. Speculation is inevitable when forward looking, but there is a difference between error bars and various projections vs unicorns.

        Due diligence never goes out of style.

      • JumpCrisscross 19 hours ago

        > when companies en masse stopped paying stock dividends

        Do you mean cash dividends [1]?

        Also, the premise is false. Dividend yields have roughly tracked interest rates [2]. (The difference is a dirty component of the equity risk premium [3].)

        [1] https://www.investopedia.com/ask/answers/05/stockcashdividen...

        [2] https://www.multpl.com/s-p-500-dividend-yield/table/by-year

        [3] https://www.investopedia.com/investing/calculating-equity-ri...

        • cft 16 hours ago

          I changed the typo, thanks. Cash dividends. This analysis does not negate common sense: when a company does not pay cash dividends, owning its stock is purely speculative, like owning Solana. When it does, you get cash dividends funded by the company's tangible revenue, proportional to your number of shares.

  • Buttons840 19 hours ago

    I saw some Europeans hoping that the US would ban DeepSeek, because then there would be less traffic interfering with their own DeepSeek queries.

    The US can ban all they want, but if the rest of the world starts preferring Chinese social media, Chinese AI, and Chinese websites in general, the US is going to lose one of its crown jewels.

    The way the US behaves is a problem and makes a lot of people prefer alternatives just for the sake of avoiding the US, which is why it's important that the US get along with other nations, but--well, about that...

    • nozzlegear 19 hours ago

      Agreed, you've highlighted one of the key problems with protectionism and nativism. Banning competition just weakens America's global influence, it doesn't make it stronger.

      • darkwizard42 18 hours ago

        This statement doesn't seem to hold true. China has banned nearly all US tech companies and social products. It has not decreased China's influence (which comes through manufacturing/retail as well as tech).

        I don't think your statement holds with current behavior.

        • kergonath 17 hours ago

          > China has banned nearly all US tech companies and social products. It has not decreased China's influence

          Being hostile does not bring you friends. Sure, various countries can have reasons to suck it up anyway (e.g. because of sanctions, or because China makes an offer too good to pass, although even that comes with strings attached). But in the long run you just create clients or satellites who will escape at the first occasion.

          The American foreign policy around the middle of the 20th century relied very effectively on soft power, which is something you can leverage to get much more out of your investments than their pure monetary value. It is not required in order to gain influence, but it is a force multiplier.

          • philistine 17 hours ago

            Then how can you explain that China's hostility towards Western tech companies operating inside its own country has not created what you're describing?

            Is hostility a bad idea only for America? Sure hope not.

            • nozzlegear 15 hours ago

              > Is hostility a bad idea only for America? Sure hope not.

              I think protectionism is long-term bad for every country, but it's especially and uniquely bad for the biggest economy in the world who has net benefitted the most from free trade and competition. There's no denying that China is influential – the argument is that they could've been (and still can be) so much more influential by embracing western tech instead of walling themselves off.

            • freeone3000 17 hours ago

              America is reliant on purchasing cheap goods from elsewhere and selling expensive technology. If it’s hostile toward the suppliers of cheap goods or the buyers of expensive technology, well, what purpose does it have on the global scale?

            • kergonath 16 hours ago

              I am saying that they could have got much more, particularly considering the spectacular mistakes western countries kept making for the last ~2 decades.

        • nozzlegear 18 hours ago

          But China has never been a global leader in tech or social media. They undoubtedly have influence in these areas, but they've never dominated them like the US has. Banning foreign competition in a field where you already dominate, like tech and AI, has different consequences than banning it where you're playing catch up.

          • jononor 14 hours ago

            What is your definition of "tech"? A very large amount of the electronics products in the world are made in China (specifically in/around Shenzhen and the wider Guangdong province). Both consumer goods and industrial goods, from the cheapest stuff to the most advanced and everything in between. They provide the manufacturing for brands from all over the world, including goods "from the west". The amount of the economy that depends entirely on this low-cost, high-quality manufacturing is insanely large - both directly in electronics goods, but also as part of many other industries, because you need electronics to build anything else.

            • nozzlegear 8 hours ago

              By "tech" I'm sort of vaguely handwaving at Silicon Valley et al. I agree that China has built up a massive manufacturing industry that the west depends on, but I don't think that "being a significant cog in the machine," so to speak, buys as much influence or bargaining power as being the maker or owner of the machine. It's better to have the Apples and Googles of the world than it is to have the SG Micros or BYD Electronics.

          • philistine 17 hours ago

            TikTok has been the darling of the world for years at this point. They’re a global leader.

            • nozzlegear 15 hours ago

              Preface my statement with "historically, until the last 5 years or so" and it still stands. TikTok is definitely influential, there's no arguing that.

      • paxys 19 hours ago

        Plus technology cycles move so quickly that you won't have to wait a generation or two to see the effects of this isolationism.

      • sailfast 18 hours ago

        You know this. I know this. But the President of the United States does not know this.

    • tensor 17 hours ago

      I've recently cancelled my Github Copilot subscription and now use Mistral. When the US starts threatening allies with tariffs or invasion, using US services becomes a major business risk.

      • mongol 14 hours ago

        Not only a business risk. It also becomes a moral imperative to avoid if you can. Don't support bullies, is my motto. It can be hard to completely avoid, but it is important to try.

    • karel-3d 18 hours ago

      EU will ban DeepSeek sooner because of (lack of) GDPR compliance

      • whatevaa 18 hours ago

        That's OK, they provide the models. They can be run in Europe, given you have capable hardware.

        • wongarsu 17 hours ago

          And the path to pleasing the EU would be straightforward: make an EU subsidiary, have it host or rent GPUs and servers in Europe, make sure personally-identifiable data is handled in accordance with GDPR and doesn't leave that subsidiary, and make sure EU customers make their accounts with and are served by that subsidiary.

          Meanwhile, to please the US they would probably have to move the entire company to the US. And even that may not be enough

      • acheong08 18 hours ago

        Banning the site would be fine. The model itself will still be available from a variety of providers as well as locally. The US is more likely to ban the model itself on the basis of national security

      • Buttons840 17 hours ago

        Does EU block websites that don't comply with their laws?

        If DeepSeek becomes popular in America I predict it will be blocked, national firewall style. Will EU do the same?

        • surgical_fire 17 hours ago

          Generally no. For all people complaining about EU regulations, the regulators typically opt to fine companies into compliance.

  • onlyrealcuzzo 20 hours ago

    Theoretically this should be good for OpenAI - in that they can reduce their costs by ~27x and pass that along to end users to get more adoption and more profit.

    • ceejayoz 20 hours ago

      No; those costs were their moat.

      • whatshisface 20 hours ago

        I wish more people had understood that spending a lot of money processing publicly available commodities with techniques available in the published literature is the business model of a steel mill.

        • JumpCrisscross 20 hours ago

          > is the business model of a steel mill

          It’s the business of commodities. The magic is in tiny incremental improvements and distribution. DeepSeek forces us to question if AI—possibly intelligence—is a commodity.

          • camdenreslink 18 hours ago

            I’m not sure I’d call what LLMs do intelligence. Not yet anyway…

            • JumpCrisscross 17 hours ago

              > not sure I’d call what LLMs do intelligence

              No, but it's good enough to replace some office jobs. Which forces us to ask, to what degree is intelligence--unique intelligence--required for useful production? (We can ask the same about physical strength.)

              • FridgeSeal 16 hours ago

                I find it interesting that so much discussion about "LLMs can do some of our work" is centred around "are they intelligent?" and not what I see as the precursor question of "are we doing a lot of bullshit work?"

                My partner is in law, along with several friends and the amount of completely _useless_ work and ceremony they’re forced to do is insane. It’s a literal waste of their talent and time. We could probably net most of the claimed AI gains by taking a serious look at pointless workloads and come out ahead due to not needing the energy and capital expenditure.

          • sebzim4500 19 hours ago

            Surely that would be amazing for NVDA? If the only 'hard' part of making AI is making/buying/smuggling the hardware then nvidia should expect to capture most of the value.

            • JoshTko 18 hours ago

              No. Before DeepSeek R1, Nvidia was charging $100 for a $20 shovel in the gold rush. Now every Fortune 100 company can build an o1-level model with currently existing (and soon-to-be-online) infra. Healthy demand for H100s and Blackwell will remain, but paying $100 for a $20 shovel is unlikely.

              Nvidia will definitely stay profitable for now, though, as long as DeepSeek's breakthroughs are not further improved upon. But if others find additional compression gains, Nvidia won't recapture its old premium. Its stock hinged on 80% margins and 75% annual growth; DeepSeek broke that premise.

              • wongarsu 17 hours ago

                There still isn't a serious alternative for chips for AI training. Until competition catches up or models become so efficient they can be trained on gaming cards Nvidia will still be able to command the same margins.

                Growth might take a short-term dip, but may well be picked up by induced demand. Being able to train your own models "cheaply" will cause a lot more companies and departments want to train their own models on their own data, and cause them to retrain more frequently.

                The time of being able to sell H100 clusters for inference might be coming to an end though.

            • ceejayoz 19 hours ago

              DeepSeek revealed it's not as hard as previously thought; a much smaller number of less sophisticated chips was sufficient.

            • JumpCrisscross 19 hours ago

              > that would be amazing for NVDA?

              It’s good for Nvidia. It’s not as good as it was before. (Assuming DeepSeek’s claims are replicable.)

            • throwaway48476 18 hours ago

              NVDA is too invested in training and underinvested in edge inference.

      • cjbgkagh 20 hours ago

        But have you heard of Jevons Paradox…… /s

        OMG, it seems tech has been invaded by baaing crypto bros

      • scotty79 19 hours ago

        Maybe they even suppressed algorithmic improvements in their company to preserve the moat. Something akin to Kodak suppressing internal research on digital cameras because they were the world-leading producer of photo film.

      • onlyrealcuzzo 20 hours ago

        You don't need a moat when you're in first place.

        Their moat is >1B people are already using ChatGPT monthly.

        They aren't going to switch unless something is substantially better.

        • like_any_other 20 hours ago

          > Their moat is >1B people are already using ChatGPT monthly.

          Unlike a social network, network effects won't help them - their users don't care how many other users they have, only about the AI output quality.

          > They aren't going to switch unless something is substantially better.

          Or approximately as good but cheaper.

          • onlyrealcuzzo 20 hours ago

            > Or approximately as good but cheaper.

            You're fooling yourself if you think OpenAI is going to pass up implementing the same strategies to get a ~27x cheaper model.

            > Unlike a social network, network effects won't help them - their users don't care how many other users they have, only about the AI output quality.

            Google Search doesn't have a network effect. Everyone on HN has been saying Google Search is complete garbage for a decade. It still has the same market share (roughly) as it did a decade ago.

            • like_any_other 20 hours ago

              > You're fooling yourself if you think OpenAI is going to pass up implementing the same strategies to get a ~27x cheaper model.

              But that would mean a 27x lower valuation.

              • JumpCrisscross 20 hours ago

                > that would mean a 27x lower valuation

                Not directly. The 27x is about costs. What it means is some order of magnitude of more competition. That reduces natural market share, price leverage and thus future profits.

              • onlyrealcuzzo 20 hours ago

                > But that would mean a 27x lower valuation.

                No.

                Valuations are based on future profits. Not future revenues.

                You can theoretically lower your costs by 27x and end up with 2x more future profits - if you're actually 45x cheaper (which DeepSeek's method claims to be).

                • like_any_other 19 hours ago

                  You mean charge a 27x lower price, but have 45x lower costs, so your profit margin has doubled?

                  Your relative margin may have doubled, but your absolute profit-per-item hasn't. Say you had a 10% margin before, at a $100 price and $90 cost, for a $10 profit-per-item. Reduce price 27x and cost 45x, so $3.7 price, $2 cost, and $1.7 profit-per-item. 6x less profit - not as bad as 27x, but not good if you're OpenAI.
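                  The same check in code (numbers from the hypothetical above):

                    price, cost = 100.0, 90.0
                    new_price, new_cost = price / 27, cost / 45
                    print(price - cost)          # $10.00 profit per item before
                    print(new_price - new_cost)  # ~$1.70 after
                    print((price - cost) / (new_price - new_cost))  # ~5.9x less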

                  • onlyrealcuzzo 19 hours ago

                    > Your relative margin may have doubled, but your absolute profit-per-item hasn't.

                    ChatGPT doesn't have any profits right now.

                    We have no idea what investors are expecting future profits to be.

                    > Say you had a 10% margin before, at a $100 price and $90 cost, for a $10 profit-per-item. Reduce price 27x and cost 45x, so $3.7 price, $2 cost, and $1.7 profit-per-item. 6x less profit - not as bad as 27x, but not good if you're OpenAI.

                    Now do the same thing but assume you have 10x more subscribers because the prices are ~27x lower.

                    You end up with almost 2x more total profit.

                    Just take ChatGPT's ~$200 subscription. Hardly anyone is going to pay ~$200 a month. Reduce that by 27x - and you're at $7.5 per month. Maybe 10% of people on the planet will pay that.

                    • hadlock an hour ago

                      If ChatGPT starts selling ads on chat results, that will probably improve revenue. I've seen social media ads recently for things I've only typed into ChatGPT, which leads me to believe they're already monetizing it through advertising platforms.

                    • ceejayoz 19 hours ago

                      > Now do the same thing but assume you have 10x more subscribers because the prices are ~27x lower.

                      You're in various spots of this thread pushing the idea that their 1B MAUs make them unassailable. How are they gonna get to 10B in a world with fewer people than that?

                      > Just take ChatGPT's ~$200 subscription. Hardly anyone is going to pay ~$200 a month. Reduce that by 27x - and you're at $7.5 per month. Maybe 10% of people on the planet will pay that.

                      They can't even make money at the $200 price point, though. https://x.com/sama/status/1876104315296968813

                • ceejayoz 20 hours ago

                  > Valuations are based on future profits.

                  Which are estimated, in significant part, by the chance of a competitor arising.

                  If the barriers of entry are much lower than originally thought, the potential profit margin plummets.

            • rurp 19 hours ago

              Google spends immense amounts of resources every year to ensure that their search is almost always the default option. Defaults are extremely powerful in consumer tech.

            • digitalPhonix 18 hours ago

              > Google Search doesn't have a network effect. Everyone on HN has been saying Google Search is complete garbage for a decade. It still has the same market share (roughly) as it did a decade ago.

              It absolutely does. People use Google for search -> Websites optimise for Google -> People get “better” results when searching with Google.

              The fact that its market share is sticky and not responding quickly to changes in quality is itself indicative of the network effect.

            • gtirloni 20 hours ago

              Other search engines don't have a gigantic advertising budget or a dominant browser pounding on users' heads to use them.

          • jasonjmcghee 20 hours ago

            1 billion MAU? What's the source on that? Very difficult to believe.

        • kgwgk 20 hours ago

          https://archive.is/c6cn9

          « The thing I noticed right away when Claude came out is how little lock-in ChatGPT had established. This was very different to my experience when I first ran a search on Google, sometime in the year 2000. After the first time I used Google, I literally never used another search engine again; it was just light years ahead of its competitors in terms of the quality of its results, and the clarity of its presentation. This week I added a third chatbot to the mix: DeepSeek »

          Follow up: https://x.com/TheStalwart/status/1884606421225848889

          • mohsen1 19 hours ago

            My guess is that OS vendors are the real winners in the long run. If Siri/Google can access my stuff and the core of LLMs is this replicable, then I don't see anyone downloading any apps for their typical AI usage. Especially since users have to go out of their way to allow a 3rd party to access all their data.

            This is why OpenAI is so deep in the product development phase right now. They have to become the OS to be successful, but I don't see that happening.

        • ceejayoz 20 hours ago

          > You don't need a moat when you're in first place.

          Tell that to Friendster/MySpace and Facebook.

          • onlyrealcuzzo 20 hours ago

            Cute, but MySpace didn't have >1B users; it didn't even have 10M when Facebook launched.

            Try again.

            • ceejayoz 20 hours ago

              Nothing had a billion users; the Internet didn't at the time.

              MySpace and Friendster both spent significant time as the #1 social sites. Facebook unseated them rapidly. The same is possible for OpenAI.

              • onlyrealcuzzo 20 hours ago

                [flagged]

                • ceejayoz 20 hours ago

                  MySpace and Friendster both claimed ~115M peak users.

                  > It's literally orders of magnitude.

                  Sure, and the speed at which ChatGPT went from zero to a billion is precisely why they need a moat... because otherwise the next one can do it to them.

                  Your argument is like a railroad company in 1905 scoffing at the idea that airliners will be a thing.

                  • onlyrealcuzzo 20 hours ago

                    Peak users is not the number of users they had when Facebook started.

                    Facebook probably would've never become a thing if MySpace already had ~115M users when it started.

                    MySpace had ~1M.

                    That's why DeepSeek (or anyone else) is going to have an incredibly difficult time convincing ~1B users to switch from ChatGPT to their tool instead.

                    Can it happen? For sure.

                    Will it happen? Less likely.

                    If anyone unseats ChatGPT, it's much more likely to be a usual suspect like Google, Apple, or Microsoft than some obscure company no one has ever heard of.

                    Of course, anything is possible.

            • cjbgkagh 20 hours ago

              It had more users than Facebook when Facebook launched, so I'm not sure what your point is.

            • shmeeed 16 hours ago

              Dude, if you seriously believe OpenAI has 1B active users, you should go touch grass. Actual estimates are 100-200 million, about an order of magnitude lower.

        • Sateeshm 19 hours ago

          There is no network effect (Amazon, Instagram, etc.), nor an enterprise vendor lock-in (Microsoft Office/AD, Apple App Store, etc.). In fact, it's quite the opposite: the way these companies deliver output is damn near identical. Switching between them is pretty painless.

        • JumpCrisscross 20 hours ago

          > You don't need a moat when you're in first place

          There are different moats [1]. You’re describing incumbency, an intangible moat. It’s nice, but it’s fickle. Particularly with something with low switching costs.

          OpenAI could argue, before, that it had a natural monopoly. More people use OpenAI so it gets more revenue and more data which lets it raise more capital to train these expensive models. That may not be true, which means it only has that first, shallow moat. It’s Nike. Not Google.

          [1] https://en.m.wikipedia.org/wiki/Economic_moat

          • onlyrealcuzzo 20 hours ago

            > There are different moats [1]. You’re describing incumbency, an intangible moat. It’s nice, but it’s fickle. Particularly with something with low switching costs.

            Google has a low switching cost, and hardly anyone switches.

            ChatGPT is quite similar to Google in this way.

            • JumpCrisscross 20 hours ago

              > Google has a low switching cost, and hardly anyone switches

              Google has massive network effects on its ad business and a natural monopoly on its search index. Crawling the web is expensive. It’s why Kagi has to pay Google (versus being able to pay them once and then stop).

            • scarface_74 19 hours ago

              Thought experiment: if tomorrow Apple changed the default search engine from Google to ChatGPT for iOS, how fast would Google’s dominance drop?

              iOS has 70% market share in the US

        • flavius29663 20 hours ago

          Just for the chatbot, it's trivial to switch: create a new account and start asking questions of DeepSeek instead. There is nothing holding users in ChatGPT.

          • ceejayoz 19 hours ago

            And the bigger risk is the big companies making deals - like Apple including ChatGPT access in iOS - canceling those to do it on-device or in-house.

            1B MAUs doesn't look great if half of them come from one source that can easily change to a competitor.

        • lm28469 20 hours ago

          > They aren't going to switch unless something is substantially better.

          Except one product is 100% free and the other is mostly locked behind paid subscriptions

          • scarface_74 19 hours ago

            How long can DeepSeek stay free?

            It’s already unable to keep up with demand, it will never be the default on mobile devices and businesses in the US will never trust it.

            • ceejayoz 19 hours ago

              That's not really the important question.

              The important question is "will this and similar optimizations to come permit local LLM use, cutting OpenAI out of the equation entirely?"

              • scarface_74 18 hours ago

                Businesses don't even want to maintain servers locally. They definitely aren't going to start managing servers beefy enough to run LLMs and try to run them with the reliability, availability, etc. of cloud services.

                This will make the cloud providers - especially AWS, GCP and, to a lesser extent, the also-ran clouds - more valuable. The other models hosted by AWS on Bedrock are already “good enough” for most business use cases.

                And then consumers are definitely not going to be running LLMs locally on their computers to replicate ChatGPT (the product), any more than they are going to get an FTP account, mount it locally with curlftpfs, use SVN or CVS on the mounted filesystem, and access the FTP account from Windows or Mac through built-in software instead of using cloud storage like Dropbox. [1]

                Whether someone comes up with a better product than ChatGPT and overcomes the brand awareness remains to be seen.

                [1] Also the iPod had no wireless, less space than the Nomad and was lame.

                • ceejayoz 17 hours ago

                  > And then consumers are definitely not going to be running LLMs locally on their computers to replicate ChatGPT...

                  Not personally. They'll let Apple handle it for them.

                  (This is already a thing. https://machinelearning.apple.com/research/introducing-apple...)

                  • scarface_74 17 hours ago

                    There is a reason I kept emphasizing the ChatGPT product. The (paid) ChatGPT product is not just a text-based LLM. It can interpret images, has a built-in Python runtime to offload queries that LLMs aren’t good at (like math), and offers web search, image generation, and a couple of other integrations.

                    The local LLMs on iPhones are literally 1% as powerful as server-based models like 4o.

                    That’s before even considering battery life.

                    • ceejayoz 16 hours ago

                      > The local LLMs on iPhones are literally 1% as powerful as server-based models like 4o.

                      Currently, yes. That's why this is a compelling advance - it makes local LLMs much more feasible, especially if this is just the first of many breakthroughs.

                      A lot of the hype around OpenAI has been due to the fact that buying enough capacity to run these things wasn't all that feasible for competitors. Now, it is, potentially even at the local level.

        • spinlock_ 20 hours ago

          I don't agree. You don't have a moat if you are offering the same quality at a higher price.

        • cogman10 20 hours ago

          Free that runs locally on consumer hardware sounds substantially better.

        • sandclock 20 hours ago

          That is exactly what a moat is. Keeping others out.

          moat noun a deep, wide ditch surrounding a castle, fort, or town, typically filled with water and intended as a defense against attack.

          • ceejayoz 20 hours ago

            Precisely. First place needs the moat.

            Second place just needs a catapult and a diseased cow.

        • meiraleal 19 hours ago

          oh well, I switched yesterday from a paid plan to a free one and I'm quite happy with the quality improvement.

    • blantonl 20 hours ago

      Nah, those costs were for their doomsday bunkers and crypto purchases, and maybe a house or 3

    • mirzap 19 hours ago

      Training costs are not the same as inference costs. DeepSeek (or anyone hosting DS largest model) will still need a lot of money and a bunch of GPU clusters to serve the customers.

    • askl 20 hours ago

      They could pivot to being a wrapper around DeepSeek. That would also save a lot of R&D costs.

    • btbuildem 18 hours ago

      > pass that along to end users

      I don't think that's at all likely in the current economic system

    • wturner 20 hours ago

      Capitalism - a system where rational actors make informed decisions.

  • dkjaudyeqooe 19 hours ago

    Not sure I agree with your premise, but what exactly are they going to ban?

    They can stop DeepSeek doing various things commercially, I guess, but stopping Americans from using their ideas is simply impossible, and stopping use of their source or weights would be (likely successfully) challenged under the First Amendment.

    There is no law against simply destroying trillions of dollars of shareholder value.

  • fullshark 20 hours ago

    Would it even matter? Isn't the cat out of the bag and everything they did repeatable by an American research team?

    • whatshisface 20 hours ago

      American researchers had already made enough progress to prove that LLMs were not an incomprehensible trade secret based on years of secret knowledge - investors and tech executives were simply led to believe otherwise. Well-connected people are probably very mad about this, and they may try to lash out like the emotional human beings they are.

    • bhouston 20 hours ago

      It doesn't matter from the US government's perspective if all of the tech is replicated by US companies and US users continue to use US AI technology. But if US users start to use Chinese AI tech, then protectionist urges will appear that will likely figure out how to ban its use or subject it to large tariffs (e.g. TikTok, BYD, network equipment, solar panels, etc.)

    • Cumpiler69 20 hours ago

      It matters because their goal was hyping up how advanced and difficult their tech is, propping up their valuations.

      DeepSeek proved the emperor had no clothes and wiped out a lot of their valuation when investors saw that reaching parity with ChatGPT is not really that difficult.

      • mastazi 20 hours ago

        I think the parent was asking whether it would even matter if there was a ban. To which the answer would be "no", because as you said the point has been made. And, as the parent pointed out, it's repeatable anyway.

  • dtquad 19 hours ago

    >It's reasonably likely that a lot of people linked to the federal government want to ban DeepSeek.

    It took them years and years to move forward with the ban on TikTok and it still hasn't been banned yet. There is no way they are going to ban some MIT-licensed weights.

    >"they destroyed $1T of shareholder value."

    The market has largely recovered.

    • tokioyoyo 19 hours ago

      There is big American money invested in TikTok. That doesn’t seem to be the case for DeepSeek.

  • nullbyte 18 hours ago

    I think the real concern from the govt's perspective is data privacy, since all the chat messages are stored on Chinese servers

  • the_sleaze_ 20 hours ago

    "easy to reinvent" often comes after "hard to invent"

    • mromanuk 18 hours ago

      At Microsoft’s size, they don’t care; they just buy out others.

  • leesec 18 hours ago

    It was so easy it cost hundreds of millions of dollars, only one company has done it, and they had to lie about it.

  • dfxm12 19 hours ago

    There's a lot of egg on people's faces now. DeepSeek shows there's nothing special about America or its economic system that breeds innovation. DeepSeek shows how these tech oligarchs greatly overplayed their hand and along with the president, bamboozled the taxpayer to enrich each other. I just hope the voters remember this in 2 years, 4 years and beyond.

    • iforgot22 17 hours ago

      If the allegations are true, the special thing about OpenAI is that it didn't have to be trained off DeepSeek. But either way, you maybe don't want to invest billions in something if someone else will be able to copy it for less.

    • dtquad 19 hours ago

      >DeepSeek shows how these tech oligarchs greatly overplayed their hand and along with the president, bamboozled the taxpayer to enrich each other.

      How much taxpayer money has gone to OpenAI and Anthropic? They are the two big sinners in closed AI.

  • IAmGraydon 20 hours ago

    Yeah they're setting this up to ban it. Crazy that they think this kind of approach will work in any way. Banning H100s didn't work, and actually pushed them to innovate. Now someone has found a more efficient way to train a model and they decide the best way forward is for the US not to benefit from access to it? This is clear evidence of collusion between OpenAI and the US Government to disadvantage competitors. Beyond that, it will never work. If they need to be reminded of just how little power they have to control the distribution of open source models, I think we would all be happy to enlighten them.

  • Cumpiler69 20 hours ago

    >By revealing that Microsoft et al. paid way too much to OpenAI et al. for technology that was actually easy to reinvent.

    That's why it's called a bubble. Pretty sure my great great grandad also overpaid for some tulips.

  • duxup 20 hours ago

    When it comes to the executive branch's role in banning something, I'm not convinced they're even honest about it / what the context even is.

    Trump wanted to ban TikTok before... and then simply chose not to / forgot about it.

    Next round, Congress acted, and Trump delayed it and has said that he is interested in his friends buying it.

    Is there really a competitive plan here or is it just fishing for payouts / grifting for allies?

    The context is always about competition, but I'm not even sure that's their plan.

nba456_ 20 hours ago

A big part of Project 2025 is increasing patent regulations. I would not be surprised if the current admin moves to ban DeepSeek because of this.

MagicMoonlight 4 hours ago

So much for that walled garden. If rival firms can just download your entire model by talking to it then your company shouldn’t be worth billions.

iimaginary 2 hours ago

Where did I leave my tiny violin?

hugoromano 14 hours ago

OpenAI initially scraped the web and later formed partnerships to train on licensed data. Now, they claim that DeepSeek was trained on their models. However, DeepSeek couldn't use these models for free and had to pay API fees to OpenAI. From a legal standpoint, this could be seen as a violation of the terms and conditions. While I may be mistaken, it's unclear how DeepSeek could have trained their models without compensating OpenAI. Basically, OpenAI is saying machines can't learn from their outputs as humans do.

Kiro 20 hours ago

> Furious [...] shocked

I'm not seeing it. I get it, the narrative that OpenAI is getting a taste of their own medicine is funny but this is not serious reporting.

  • Kiro 13 hours ago

    The link has been changed. My comment was about a different article that speculated on what OpenAI was "feeling" using hyperbole.

caseyy 6 hours ago

Seeing as OpenAI is on the back foot, I hope nationalistic politicians don’t use this opportunity to strengthen patent laws.

If one could effectively patent software inventions, this would kill many industries, from video games (that all have mechanics of other games in them) to computing in general (fast algorithms, etc). Let’s hope no one gets ideas like that…

Granted, it would be ineffective in competing against China’s tech industry. But less effective laws have been lobbied through in the past.

guybedo 17 hours ago

This is hilarious.

Everybody has evidence OpenAI scraped the internet at a global scale and used terabytes of data it didn't pay for. Newspapers, books, etc...

pknerd 4 hours ago

The reason OpenAI is whining:

> OpenAI’s o1 costs $60 per million output tokens; DeepSeek R1 costs $2.19. This nearly 30x difference brought the trend of falling prices to the attention of many people.

From Andrew Ng's recent DeeplearningAI newsletter

ingohelpinger 4 hours ago

OpenAI should be quiet, since they've scraped the entire internet for their training data.

cbracketdash 18 hours ago

Let's also not forget Suchir Balaji, who was mysteriously killed when exposing OpenAI's violation of copyright law.

JBSay 18 hours ago

When China is more open than you, you've got a problem

jchook 18 hours ago

Friendly reminder that China publishes twice as many AI papers as the US[1], and twice as many science and engineering papers as the US.

China leads the world in the most cited papers[2]. The US's share of the top 1% highly cited articles (HCA) has declined significantly since 2016 (1.91% to 1.66%), while China's has doubled since 2011 (0.66% to 1.28%)[3].

China also leads the world in the number of generative AI patents[4].

1. https://www.bfna.org/digital-world/infographic-ai-research-a...

2. https://www.science.org/content/article/china-rises-first-pl...

3. https://ncses.nsf.gov/pubs/nsb202333/impact-of-published-res...

4. https://www.wipo.int/web-publications/patent-landscape-repor...

waffletower 18 hours ago

"Stole" - I don't believe that word means what he thinks it means. Perhaps I pre-maturely anthropomorphize AI -- yet when I read a novel, such as The Sorcerer's Stone, I am not guilty of stealing Rowling's work, even if I didn't purchase the book but instead found it and read it in a friend's bathroom. Now if I were to take the specific plot and characters of that story and write a screenplay or novel directly based on it, and, explicitly, attempt to sell this work, perhaps the verb chosen here would be appropriate.

adam_arthur 16 hours ago

Who cares?

They did the exact same thing with public information. Their model just synthesizes and puts out the same information in a slightly different form.

Next we should sue students for repeating the words of their teachers

vinni2 a day ago

How would they prove DeepSeek used their model? I would be curious to know their methodology. Also, what legal actions can OpenAI take? Can DeepSeek be banned in the US?

  • iforgot22 17 hours ago

    They might show DeepSeek's model calling itself ChatGPT, which users have already alleged. Same as how Cisco proved Huawei was stealing router code.

    Except in this case, nothing was stolen, unless they want to call ChatGPT's own training on source data theft too.

    • freehorse 14 hours ago

      ChatGPT outputs are all over the internet. It is harder to prove that DeepSeek used specifically o1 for training, instead of a lot of ChatGPT output ending up in the training set from other sources.

      • iforgot22 14 hours ago

        That's a good point, at least for the prompts I saw. Like "do you have an app I can use" is commonly seen with "here's the ChatGPT app" online. And maybe they don't add anything telling DeepSeek that it's DeepSeek.

conartist6 17 hours ago

It seems to be undermined by the same principle that says that going into a library and reading a book there is not stealing when you walk out with the knowledge from the book.

OpenAI seems to feel that way about their use of copyrighted material: since they didn't literally make a copy of the source material, it's totally fair game. It seems like this is the same argument that protects DeepSeek, if indeed they did this. And why not? Reading a lot of books from the library is a way to get smarter, and that is ostensibly the point of libraries.

highfrequency 15 hours ago

If true, the question is: did they use ChatGPT outputs to create Deepseek V3 only, or is the R1-zero training process a complete lie (given that the whole premise is that they used pure reinforcement learning)? If they only used ChatGPT output when training V3, then they succeeded in basically replicating the jump from ChatGPT-4o to o1 without any human-labeled CoT (and published the results) - which is a big achievement on its own.

WD-42 20 hours ago

Information wants to be free! No, not like that!

WolfOliver 4 hours ago

I guess DeepSeek paid OpenAI for the usage of their API according to OpenAI's pricing?

So what is the point if you pay for it and cannot use the results how you see fit?

mbowcut2 16 hours ago

So, is this just an example of first-mover disadvantage (or maybe the problem of producing public goods)? The first AI models were orders of magnitude more expensive to create, but now that they're here we can, with techniques like distillation, replicate them at a fraction of the cost. I am not really literate in the law, but weren't patents invented to solve problems like this?

elashri 20 hours ago

There is an Egyptian saying that would translate to something like

"We didn’t see them when they were stealing, we saw them when they were fighting over what was stolen"

That describes this situation. Although, to be honest, all this aggressive scraping was noticeable only to people who understand it, which is not the majority. But now everyone knows.

  • waveBidder 17 hours ago

    > Although, to be honest, all this aggressive scraping was noticeable only to people who understand it, which is not the majority.

    When you say noticeable, do you mean in like, traffic statistics? Or in what the model knows that it clearly shouldn't if it wasn't trained in legally dubious ways?

  • sadjad 18 hours ago

    "When two thieves quarrel, what was stolen emerges."

  • meiraleal 19 hours ago

    "We didn’t see them when we were stealing, we saw them when they were fighting over what we stole"

    fixed for you

    • nicce 18 hours ago

      That means a different thing.

ceejayoz 20 hours ago

"You can't take data without asking" seems like a court precedent OpenAI really, really, really wants to avoid. And yet...

  • amelius 20 hours ago

    Why? When did large companies care about laws? See e.g. Uber, AirBnb.

    The only thing government cares about at this point is if information is shared with China.

    • ceejayoz 20 hours ago

      They care when they get big enough to attract attention from people like state AGs who can actually put the hurt on a bit. Uber and AirBnB both hit this point years ago; OpenAI's starting to hit it.

      • galleywest200 19 hours ago

        Altman is part of that Stargate Trump group now. He and his ilk will just get pardons.

        Curious, though, can a corporation be pardoned?

        • ceejayoz 19 hours ago

          The President can only pardon Federal crimes.

          State-level crimes (like his NY felonies) and civil torts (like his case where he owes $500M currently) are separate.

          • actionfromafar 19 hours ago

            Yet. Give it some time.

            • ceejayoz 19 hours ago

              Sure, but in that scenario, it's a bit like the Last of Us characters being concerned about electrical meter readings. We'll have much bigger problems.

  • layer8 19 hours ago

    OpenAI is saying that their service was used in violation of their TOS, which is a bit different than just copying data. To be clear I’m not on OpenAI’s side, but it looks to me that the legal situation isn’t exactly analogous.

    • orlp 19 hours ago

      If using data violating some ToS taints the model trained on that data, then all of OpenAI's models are tainted by the millions of ToS'es they broke.

    • kavalg 18 hours ago

      As others have noted, if one company agrees to the ToS, asks "the right" questions, and then publishes the ChatGPT answers, there is no violation of the ToS. Then a second company scrapes the published Q&A, along with other information from the internet, and again there is no violation (no more than OpenAI's own violations).

    • dkjaudyeqooe 19 hours ago

      But what's the remedy in that case? Being banned from the service, maybe, but no court is going to force a "return" of the data so that DeepSeek can no longer use it. It's uncopyrightable.

    • DebtDeflation 19 hours ago

      Tons of websites and books they scraped had copyright notices.

      • layer8 19 hours ago

        Copyright and terms of service are different legal notions.

        • Maxion 18 hours ago

          Yeah, copyright means something and a ToS is virtual toilet paper (at least in the EU).

          • layer8 18 hours ago

            This wasn’t about which is worse than the other, but about whether OpenAI would want to avoid court precedent for the one because of the other.

    • hdjjhhvvhga 17 hours ago

      > OpenAI is saying that their service was used in violation of their TOS

      Which is the most ridiculous argument they could use because they didn't respect any ToS (or copyright laws, for that matter) when scraping the whole web, books from Libgen and who knows what more.

pama 20 hours ago

The R1 paper used o1-mini and o1-1217 in their comparisons, so I imagine they needed to use lots of OpenAI compute in December and January to evaluate their benchmarks in the same way as the rest of their pipeline. They show that distilling to smaller models works wonders, but you need the thought traces, which o1 does not provide. My best guess is that these types of news are just noise.

[edit: the above comment was based on sensationalist reporting in the original link and not the current FT article. I still think there is a lot of noise in this news over the last week, but it may well be that OpenAI has valid evidence of wrongdoing; I would guess that any such wrongdoing would apply directly to V3 rather than R1-Zero, because o1 does not provide traces and generating synthetic thinking data with 4o may be counterproductive.]

HarHarVeryFunny 19 hours ago

DeepSeek-R1's multi-step bootstrapping process, starting with their DeepSeek-V3 base model, would only seem to need a small amount of reasoning data for the initial (DeepSeek-R1-Zero) RL training, after which the model itself becomes the source for further data, along with some other sources that they mention.

Of course it's possible that DeepSeek used o1 to generate some of this initial bootstrapping data, but it's not obvious. o1 anyway deliberately obfuscates its reasoning process (see the "Hiding the chains of thought" section of OpenAI's "Learning to reason with LLMs" page), such that what you see is an after-the-fact "summary" of what it actually did; so, if DeepSeek did indeed use some of o1's output to train on, it shows that the details of o1's own reasoning process aren't as important as they thought - it's just having some verified (i.e. leading to a good outcome) reasoning data from any source that matters to get started.
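
To make "verified reasoning data" concrete, here is a toy sketch of the rejection-sampling idea: sample many traces and keep only the ones whose final answer passes a rule-based check. Everything below (the stub model, the checker) is an illustrative stand-in, not DeepSeek's actual code:

    import random

    def toy_model(prompt: str) -> str:
        # Stand-in for sampling a reasoning trace from an RL-trained model;
        # it is sometimes right and sometimes wrong.
        answer = random.choice(["4", "5"])
        return f"Let me think. 2 + 2 = {answer}"

    def is_verified(output: str) -> bool:
        # Rule-based check: did the trace end in the correct answer?
        return output.strip().endswith("4")

    prompt = "What is 2 + 2?"
    samples = [toy_model(prompt) for _ in range(16)]
    sft_data = [(prompt, s) for s in samples if is_verified(s)]
    print(f"kept {len(sft_data)}/{len(samples)} verified traces")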

amarcheschi a day ago

I quite like a scenario where LLM output can't be copyrighted, so that it is possible to eventually train an LLM with data from the previous one(s).

  • layer8 a day ago

    OpenAI argues it’s a violation of their terms of service. So there are legal issues if it can be proven.

    • Palmik 21 hours ago

      Legal issues for who?

      Company A pays OpenAI for their API. They use the API to generate or augment a lot of data. They own the data. They post the data on the open Internet.

      Company B has the habit of scraping various pages on the Internet to train its large language models, which includes the data posted by Company A. [1]

      OpenAI is undoubtedly breaking many terms of service and licenses when it uses most of the open Internet to train its models. Not to mention potential copyright violations (which do not apply to AI outputs).

      [1]: This is not hypothetical BTW. In the early days of LLMs, lots of large labs accidentally and not so accidentally trained on the now famous ShareGPT dataset (outputs from ChatGPT shared on the ShareGPT website).

      • layer8 20 hours ago

        For both.

        • Palmik 20 hours ago

          Posting OpenAI generated data on the internet is not breaking the ToS. This is how most OpenAI based businesses operate, after all [1] (e.g. various businesses to generate articles with AI, various chat businesses that let you share your chats, etc.)

          OpenAI is one of the companies like Company B that is using data from the open Internet.

          [1] Ownership of content. As between you and OpenAI, and to the extent permitted by applicable law, you (a) retain your ownership rights in Input and (b) own the Output. We hereby assign to you all our right, title, and interest, if any, in and to Output.

    • mannewalis a day ago

      But OpenAI's model isn't open source; how would they distill knowledge without direct access to the model?

      • layer8 a day ago

        You don’t need direct access for LLM distillation, just regular API access.
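
          A minimal sketch of what that can look like (this assumes the standard openai Python client; the prompts and the output file are just illustrative):

              # Collect (prompt, completion) pairs from a hosted model via its
              # public API, then write them out as fine-tuning data for a student.
              import json
              from openai import OpenAI

              client = OpenAI()  # reads OPENAI_API_KEY from the environment
              prompts = [
                  "Explain backpropagation step by step.",
                  "Prove that the square root of 2 is irrational.",
              ]

              with open("distill_sft.jsonl", "w") as f:
                  for p in prompts:
                      resp = client.chat.completions.create(
                          model="gpt-4o",  # any teacher model the API exposes
                          messages=[{"role": "user", "content": p}],
                      )
                      pair = {"prompt": p, "completion": resp.choices[0].message.content}
                      f.write(json.dumps(pair) + "\n")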

        • mannewalis a day ago

          ok I looked it up and have a better understanding now.

ks2048 17 hours ago

The schadenfreude and irony of this is totally understandable.

But, I wonder - do companies like OpenAI, Google, and Anthropic use each other's models for training? If not, is it because they don't want to or need to, or because they are afraid of breaking the ToS?

rkagerer 6 hours ago

Are they crying about their competitor training off their stuff, after having used the whole of the web to train their own stuff?

ripped_britches 10 hours ago

I wish there were a stock ticker for OpenAI just to see what Wall Street's take on all this is. One can imagine based on Nvidia, but I imagine OpenAI's private valuation is hit much harder. Still, I think they'll be able to justify it by building amazing products. Just interesting to watch what bankers think.

mkayle 12 hours ago

This raises the same questions I have about OpenAI: where's all this data coming from, and do they have permission to use it?

pknerd 4 hours ago

OpenAI steals data from YouTube and the Internet, so that's not fair either.

davesque 15 hours ago

I recently thought of a related question. Actually, I'm almost certain that foundation model trainers have thought of this. The question is: to what extent are popular modern benchmarks (or any references to them, descriptions of them, etc.) being scrubbed from the training data? Or are popular benchmarks designed in such a way that they can be re-parametrized for each run? In any case, it seems like a surprisingly hard problem to deal with.

ginkgotree 20 hours ago

I did not have this in my cards: the PRC open-sourcing the most powerful LLM by stealing the data set from "OpenAI". As someone who is very pro-America and pro-democracy, the irony here is just... so sweet.

DidYaWipe 14 hours ago

They have "open" right in their name, so...

Objection overruled.

JBits 20 hours ago

What is the evidence that DeepSeek used OpenAI to train their model? Isn't this claim directly benefitting OpenAI as they can argue that any superior model requires their model?

SilverBirch 20 hours ago

I think OpenAI is in a really weak position here. There are essentially two positions you can be in: You can be the agile new startup that can break the rules and move fast. That's what OpenAI used to be. Or you can be the big incumbent who is going to use your enormous resources to crush your opposition. That's Google & Microsoft here. For Microsoft to say "We're going to tie you up in lawsuits about the way you trained this model" would be perfectly expected and they can use that strategy because at any given time they have 1,000 lawyers and lobbyists hanging around waiting to do exactly that. But OpenAI can't do that. They don't have Google or Microsoft's legal teams or lobbyists or distribution channels. So whilst it's funny that OpenAI are kind of trying to go down this road, this isn't actually a strategy that is going to work for them, they're still a minnow and they're going to get distracted and slowed down by this.

  • golly_ned 12 hours ago

    > they're still a minnow

    3K+ employees, $3B+ revenue, ... sure, not BigTech but hardly a minnow. A company that big can chew gum and walk at the same time.

    • lou1306 12 hours ago

      They're trying to bark up a tree that might happen to be backed by the People's Republic of China. That's not their league, and even Microsoft would think twice before getting into that kind of kerfuffle.

      • dluan 7 hours ago

        I think commenters don't know about Bill Gates personally wining and dining Hu Jintao in Medina, Washington, 20 years ago.

    • dauhak 11 hours ago

      They're also still deep in their loss-making phase; the whole "incumbent squashing upstarts" stance is a lot easier to pull off when you're settled and printing money.

    • __loam 10 hours ago

      $7-8b in costs so they're losing $5b

      • Pedro_Ribeiro 9 hours ago

        Does that include R&D?

        If running ChatGPT costs $1B/y and they make $3B/y on it, tacking on the cost of R&D on top doesn't seem very fair.

        • __loam 35 minutes ago

          Afaik it includes inference which they seem to be offering at a loss.

  • jhanschoo 11 hours ago

    > For Microsoft to say "We're going to tie you up in lawsuits about the way you trained this model" would be perfectly expected and they can use that strategy because at any given time they have 1,000 lawyers and lobbyists hanging around waiting to do exactly that. But OpenAI can't do that. They don't have Google or Microsoft's legal teams or lobbyists or distribution channels.

    I think it's also hilarious that, supposing they can do that, they will end up suppressing innovation within the US, and eager groups in China would just innovate without having to worry about this hostile landscape.

  • htrp 15 hours ago

    But Microsoft is one of their backers?

  • bagels 8 hours ago

    Anyone who uses DeepSeek commercially is now opening the door to litigation from them as well.

zoba 20 hours ago

Does OpenAI's API attempt to detect this sort of thing? Could they start outputting bad information if they suspect a distillation attempt is underway?

stevenally 18 hours ago

They should be happy. Now they can provide that amazing AI much more cheaply. They don't need half a trillion dollars worth of Nvidia chips.

maxglute 13 hours ago

Not that DeepSeek is Luigi Mangione, but it's pretty funny OpenAI getting the dead CEO treatment.

jofzar 9 hours ago

Sorry, it's now a problem to train off other people's data? Surely openai has never trained off other people's data without permission...

aucisson_masque 12 hours ago

I don't see the difference between that and LLM feeding on internet people's data.

They call it IP theft yet when the New York Times sued OpenAI and Microsoft for copyright infringement they claimed it's fair use of data.

jasoneckert 20 hours ago

What I find the most comical about this is that the whole situation could be loosely summarized as "OpenAI is losing its job to AI."

  • mattgreenrocks 20 hours ago

    OpenAI should be excited that it has been freed of the tedious tasks of building AI and now they can focus on higher level and more creative things.

    • JoshTko 18 hours ago

      I wish I could upvote this twice

      • munchler 18 hours ago

        Soon you’ll be freed of the tedious task of upvoting at all.

    • Sateeshm 19 hours ago

      > focus on higher level and more creative things.

      But that's what OpenAI's customers were supposed to do.

      • jusonchan81 19 hours ago

        It’s sarcasm.

        • rooroobooragool 18 hours ago

          I think Sateeshm was also applying a generous layer of sarcasm.

    • pphysch 18 hours ago

      OpenAI should be, but OpenAI died a while ago

  • zbshqoa 20 hours ago

    Realistically that's the actual headline. Only another AI can replace AI, pretty much like LLMs / Transformers have replaced "old" AI models in certain tasks (NLP, Sentiment Analysis, Translation etc), and research is in progress on other tasks still performed by traditional models (personalization, forecasting, anomaly detection etc).

    If there's a better AI, old AI will lose the job first.

    • troyvit 19 hours ago

      > NLP, Sentiment Analysis, Translation etc

      As somebody who got to work adjacent to some of these things for a long time, I've been wondering about this. Are LLMs and transformers actually better than these "old" models or is it more of an 80/20 thing where for a lot less work (on developers' behalf) LLMs can get 80% of the efficacy of these old models?

      I ask because I worked for a company that had a related content engine back in 2008. It was a simple vector database with some bells and whistles. It didn't need a ton of compute, and GPUs certainly weren't what they are today, but it was pretty fast and worked pretty well too. Now it seems like you can get the same thing with a simple query but it takes a lot more coal to make it go. Is it better?

      • zbshqoa 8 hours ago

        It's 80/20, but in some tasks it's much better (e.g. translation).

        Nonetheless, the fact that you can just change the prompt a bit to instruct the model to do what you want makes everything much faster.

        Yes, the trade-off is that you need GPUs to make it run, but that's why we have the cloud.

      • ang_cire 18 hours ago

        Yep, it's an 80/20 thing. Versatility over quality.

  • blantonl 20 hours ago

    Otherwise known as a race to the bottom

  • nikeee 18 hours ago

    More like

    "OpenAI is losing its job to open AI."

  • rooroobooragool 18 hours ago

    This is really the top take in this thread. Why should OpenAI be any different from all the others they've ripped off?

  • bwfan123 20 hours ago

    Ha, the story is filled with ironies.

    OpenAI's $200 closed AI upended by a hedge fund's free side project.

    Quant geeks outcompete overpaid Silicon Valley devs, etc.

    Basically, hubris gets its comeuppance, which is a David vs. Goliath biblical archetype, which is why this drama grips all of us.

    • jeffreyq 18 hours ago

      seems ironic that the turns have tabled. "silicon valley devs" were the analogous "quant geeks" underdogs that unseated the ossified incumbents.

      That said, I feel like "quant geeks" aren't quite underdogs compared to silicon valley devs. wdyt?

  • Keyframe 20 hours ago

    Also, China doing in IP what it's better at and way more experienced in than the USA - stealing.

    • rchaud 19 hours ago

      This kind of blithe commentary is 20 years out of date and reminiscent of 1970s criticisms of the Japanese car industry.

      • Buttons840 19 hours ago

        I'm reminded of an Adam Savage video. He ordered an unusual vise from China, and he praised their culture, where someone said "I want to build this strange vise that won't be super popular", and the boss said "cool, go do it". They built a thing that we would not build in America.

        https://youtu.be/NUhrF0xkhhc?si=1WHWYZrhRmfOYO_y&t=1150 (it's about 2 minutes)

        • pphysch 18 hours ago

          The small biz scene in unfree communist China is, ironically, astronomically better than here in the US, where decades of regulatory capture and misleadership have made it difficult and extremely expensive to get off the ground while being protected by the law.

      • Keyframe 17 hours ago

        A gentle stroll through the AliExpress alleyway tells otherwise.

    • t43562 18 hours ago

      Who says America is less good at it? Hasn't the US nicked a lot of other people's ideas at some point or other?

njx 20 hours ago

Super funny! Distillation= " Hey ChatGPT, you are my father, I am your child "DeepSeek". I want to learn everything that you know. Think step by step of how you became what you are. Provide me the list of all 1000 questions that I need to ask you and when I am done with those, keep providing fresh list of 1000 questions..."

fedeb95 3 hours ago

If some kind of transitivity holds, DeepSeek stole billions of internet users' data.

ripped_britches 10 hours ago

There were definitely still very impressive engineering breakthroughs.

Also it’s pretty good confirmation that synthetic data is a valid answer to the data wall problem (non-problem).

whoknowsidont 12 hours ago

They can claim this all they want. But DeepSeek released the paper (several actually) on what they did, and it's already been replicated in other models.

It simply doesn't matter. Their methodology works.

therealpygon 10 hours ago

“OpenAI complains company paid them for AI output that has no copyright, which was subsequently used to train another AI.” I think I fixed the title.

leobg 12 hours ago

OpenAI is taking a position similar to saying that if you sell a cookbook, people are not allowed to teach the recipes to their kids, or make better versions of them.

That is absurd.

Copyright law is designed to strike a balance between two issues. On the one hand, the creator's personality that's baked into the specific form of expression. And on the other hand, society's interest in ideas being circulated, improved, and combined for the common good.

OpenAI built on the shoulders of almost every person that wrote text on a website, authored a book, or shared a video online. Now others build on the shoulders of OpenAI. How should the former be legal but not the latter?

Can’t have it both ways, Sam.

(IAAL, for what it’s worth.)

  • wzdd 10 hours ago

    The stuff about copyright seems irrelevant.

    OpenAI's future investments -- billions -- were just threatened to be undercut by several orders of magnitude by a competitor. It's in their best interests to cast doubt on that competitor's achievements. If they can do so by implying that OpenAI is in fact the source of most of DeepSeek's performance, then all the better.

    It doesn't matter whether there's a compelling legal argument around copyright, or even if it's true that they actually copied. It just needs to be plausible enough that OpenAI can make a reasonable case for continuing investment at the levels it's historically attained.

    And plausibility is something they've handily achieved with this announcement -- the sentiment on HN at least is that it is indeed plausible that DeepSeek trained on OpenAI. Which means there's now doubt that a DeepSeek-level model could be trained without making use of OpenAI's substantial levels of investment. Which is the only thing that OpenAI should be caring about.

    • chii 8 hours ago

      > It's in their best interests to cast doubt on that competitor's achievements.

      it is, but the 2nd order logic says that if they are trying to cast doubt, it means they've got nothing better to offer and casting doubt is the only step they have.

      If I was an investor in OpenAI, this should be very scary, as it simply means I've overvalued it.

      • bryanrasmussen 7 hours ago

        >it is, but the 2nd order logic says that if they are trying to cast doubt, it means they've got nothing better to offer and casting doubt is the only step they have.

        This implies that when casting doubt, the doubt is always false; if the doubt here is true, then it is a good offer.

        • yawaramin 6 hours ago

          If the doubt were true, it wouldn't be a doubt.

          • bryanrasmussen 5 hours ago

            Something is true whether or not you doubt it; you then confirm your doubt as true or prove it false.

            Commonly the phrase "sowing doubt" is used to say an argument someone has made is false, but that was evidently not what the parent poster meant, although it is probably how the comment I replied to interpreted it.

            on edit: I believe what the parent poster meant is that whether or not OpenAI/Altman believes the doubts expressed, they are pretty much constrained to cast some doubt as they do whatever else they are planning to deal with the situation. From outside we can't know if they believe it or not.

      • hatmanstack 7 hours ago

        DeepSeek is a card trick. They came up with a clever way to do multi-head attention; the rest is fluff. Janus-Pro-7B is a joke. It would have mattered a year ago, but it's also just a poor imitation of what's already on the market. Especially when they've obfuscated that they're using a discrete encoder to downsample image generation.

        • SecretDreams 7 hours ago

          Like most illusions, if you can't tell the difference between the fake and the real, they're both real.

      • ordu 7 hours ago

        > it is, but the 2nd order logic says that if they are trying to cast doubt, it means they've got nothing better to offer and casting doubt is the only step they have.

        I don't think that this is a working argument, because all their steps I can imagine are not mutually exclusive.

    • thayne 9 hours ago

      Even if that narrative is true, they were still undercut by DeepSeek. Maybe DeepSeek couldn't have succeeded without o1, but then it should have been even easier for OpenAI to do what DeepSeek did, since they have better access to o1.

      • karaterobot 7 hours ago

        This argument would excuse many kinds of intellectual property theft. "The person whose work I stole didn't deserve to have it protected, because I took their first draft and made a better second draft. Why didn't they just skip right to the second draft, like me?"

        • unclebucknasty 7 hours ago

          If DeepSeek "stole" from OpenAI, then OpenAI stole from everyone who ever contributed anything accessible on the internet.

          I just don't see how OpenAI makes a legitimate copyright claim without stepping on its entire business model.

    • 1vuio0pswjnm7 6 hours ago

      "It doesn't matter whether there's a compelling legal argument around copyright, or even if it's true they actually copied."

      Indeed, when the alleged infringer is outside US jurisdiction and not violating any local laws in the country where it's domiciled.

      The fact that Microsoft cannot even get this app removed from "app stores" tells us all we need to know.

      It will be OpenAI and others who will be copying DeepSeek.

      Some of us would _love_ to see Microsoft try to assert copyright over an LLM. The question might not be decided in their favour, putting a spectre over all their investment. It is not a risk worth taking.

      Anyone remember this one: https://en.wikipedia.org/wiki/Microsoft_Corp._v._Zamos

    • tylerchilds 9 hours ago

      I’d take this argument more seriously if there weren’t billboards advocating hiring AI employees instead of human employees.

      Sure, OpenAI invested billions banking on the livelihoods of everyday people being replaced, or as Sam says, “a renegotiation of the social contract”.

      So, as an engineer being targeted by Meta and Salesforce under the “not hiring engineers plan”, all I have to say to OpenAI is “welcome to the social contract renegotiation table”.

    • SecretDreams 7 hours ago

      OpenAI is going out of their way to demonstrate that they will willingly spend the money of their investors to the tune of 100s of billions of dollars, only to then enable 100s of derivative competitors that can be launched at a fraction of the cost.

      Basically, in a roundabout way, OpenAI is going back to their roots and more - they're something between a charity and Robin Hood, stealing the money of rich investors and giving it to poor and aspirational AI competitors.

      • hatmanstack 6 hours ago

        Homogeneous systems kill innovation; with that in mind, I guess it’s a good thing DeepSeek disregards licenses? Seems like sledding down an icy slope: slippery. And they suck.

    • stingraycharles 7 hours ago

      Isn’t this precisely how so many open-source LLMs caught up with OpenAI so quickly, because they could just train on actual ChatGPT output?

    • unclebucknasty 7 hours ago

      >It just needs to be plausible enough that OpenAI can make a reasonable case for continuing investment at the levels it's historically attained

      >there's now doubt that a DeepSeek-level model could be trained without making use of OpenAI's substantial levels of investment.

      But, this still seems to be a problem for OpenAI. Who wants to invest "substantially" in a company whose output can be used by competitors to build an equal or better offering for orders of magnitude less?

      Seems they'd need to make that copyright stick. But, that's a very tall and ironic order, given how OpenAI obtained its data in the first place.

      There's a scenario where this development is catastrophic for OpenAI's business model.

      • LeafItAlone 6 hours ago

        >There's a scenario where this development is catastrophic for OpenAI's business model.

        Is there a scenario where it isn’t?

        Either (1) a competitor is able to do it better without their work or (2) a competitor is able to use their output and develop a better product.

        Either way, given the costs, how do you justify investing in OpenAI if the competitor is going to eat their lunch and you’ll never get a return on your investment?

        • unclebucknasty 8 minutes ago

          The scenario to which I was alluding assumed the latter (2) and, further, that OpenAI was unable to prevent that—either technically or legally (i.e. via IP protection).

          More specifically, on the legal side I don't see how they can protect their output without stepping on their own argument for ingesting everyone else's. And, if that were to indeed prove impossible, then that would be the catastrophic scenario.

          On your point (1), I don't think that's necessarily catastrophic. That's just good old-fashioned competition, and OpenAI would have to simply best them on R&D.

  • otterley 12 hours ago

    As another attorney, I would impart some more wisdom:

    "Karma's a bitch, ain't it."

    • chris_wot 12 hours ago

      I quite agree. The NY Times must be feeling a lot of schadenfreude right now.

  • dkjaudyeqooe 12 hours ago

    You can't copyright AI generated works. OpenAI are barking up the wrong tree.

    • washadjeffmad 12 hours ago

      They're not making a legal claim, they're trying to establish provenance over Deepseek in the public eye.

      • xsmasher 11 hours ago

        Yes; and trying to justify their own valuation by pointing out that DeepSeek cost more than advertised to create, if you count the cost of creating OpenAI's model.

        • actsasbuffoon 10 hours ago

          Though I also think it’s extremely bad for OpenAI’s valuation.

          If you give me $500B to train the best model in the world, and then a couple people at a hedge fund in China can use my API to train a model that’s almost equal for a tiny fraction of what I paid, then it appears to be outrageously foolish to build new frontier models.

          The only financial move that makes sense is to wait for someone else to burn hundreds of billions building a better model, and then clone it. OpenAI primarily exists to do one of the most foolish things you can possibly do with money. Seems like a really bad deal for investors.

          • vkou 10 hours ago

            Fortunately everyone who gave OpenAI money did it to further their stated mission of bettering humanity, and not for the chance at any financial gain.

            • chongli 7 hours ago

              At one time they were nonprofit by choice. Now they are nonprofit not by choice.

            • thayne 9 hours ago

              I am very doubtful that is why MS gave them so much.

              • sodality2 9 hours ago

                I suspect (and hope) that this is a satirization of the claims about OpenAI's nonprofit goals and complicated legal structure

          • SecretDreams 6 hours ago

            As it turns out, first to market only matters if you can actually build a novel moat, which OpenAI presently has no chance of doing.

        • mbesto 8 hours ago

          If that's the case, then nearly every software company should be counting the cost of Linus Torvalds' development work.

        • addicted 10 hours ago

          It still doesn’t justify their valuation because it shows that their product is unprotectable.

          In fact, I’d argue this is even worse, because no matter how much OpenAI improves their product, and Altman is prancing around claiming to need $7 trillion to improve it, someone else can replicate it for a few million.

          • CamperBob2 7 hours ago

            > It still doesn’t justify their valuation because it shows that their product is unprotectable.

            First-mover advantage doesn't always have to pay off in the marketplace. FedEx probably has to schedule extra flights between SF and DC just to haul all of OpenAI's patent applications.

            I suspect that it's going to end up like the early days of radio, when everybody had to license dozens of key patents from RCA ( https://reason.com/2020/08/05/how-the-government-created-rca... ). For the same reason, Microsoft is reputed to make more money from Android licenses than Google does.

            • thelittleone 6 hours ago

              Those patents won't do much to protect them from competitors abroad.

      • SecretDreams 7 hours ago

        The public doesn't give a shit about any of these companies. There are no moats or loyalties in this space. Just individuals and corporations looking for the cheapest tool possible that gets the job close to done.

        OpenAI spent investor money to enable random Chinese AI startups to offer a better version of their own product at a fraction of the cost. In some ways, this conclusion was inevitable, but I do find the way we arrived at it particularly enjoyable to watch play out.

      • pempem 9 hours ago

        Is it predictable that they would seek to establish provenance, sure.

        Is it our job as a thinking public to decry it? Also sure. In fact, wildly yes.

      • __loam 11 hours ago

        If this is their goal, R1 is on par with the $200 a month model. Most people don't give a shit.

        • Dalewyn 10 hours ago

          >Most people don't give a shit.

          I think it's more accurate to say most people can't (and don't) care about big monetary figures.

          As far as Joe Average is concerned, ChatGPT cost $OoomphaDuuumpha and Deepseek cost $RuuunphaBuuunpha. The only thing Joe Average will care about is the bill he gets after using it himself.

          • __loam 34 minutes ago

            That's what I mean: Joe Average is going to go with free over $2,400 a year.

    • pclmulqdq 11 hours ago

      They use their terms of service as both a sword and a shield here. It's a little bit ridiculous.

  • randombits0 7 hours ago

    Recipes, that is, lists of ingredients and preparation instructions, are specifically uncopyrightable. Perhaps that’s why they used it as an example.

  • Spooky23 8 hours ago

    Seriously. OpenAI consciously stole from The NY Times for almost all of its news content. Everything about the company is shady.

    Sam should focus on the product instead of trying to out-jerk Elon and his buddies.

  • hintymad 12 hours ago

    Just to play devil's advocate, OAI can argue that they spent great effort creating and procuring annotated data. Such datasets are indeed their secret, and now DS gets them for free by distilling OAI's output. Besides, OAI's EULA explicitly forbids users from using the output of their API for model training. I'm not saying that OAI is right, of course. Just to present OAI's point of view.

    • addicted 10 hours ago

      This is an incomplete version of OpenAI’s point of view.

      OpenAI has a legally submitted point of view that they believe the benefits of AI to humanity are so great that anyone creating AI should be allowed to trample all over copyright laws, Terms of Use, EULAs, etc.

      But OpenAI’s version of benefit to humanity is that they should be allowed to trample over those laws so they can benefit humanity by closely guarding the output of trampling those laws and charging humanity an access fee.

      Even if we accept all of OpenAI’s criticisms of DeepSeek, they’re arguing that DeepSeek doing the exact same thing, but releasing the output for free for anyone to use is somehow less beneficial to humanity.

      • hintymad 10 hours ago

        This goes back to my previous criticism of OAI: Stratechery said that Altman's greatest crime is to seek regulatory capture. I think that's spot on. Altman portrays himself as a visionary leader, a messiah of the AI age. Yet when the company was still small and progress in AI had just gotten started, his strategic move was to suffocate innovation in the name of AI safety. For that, I question his vision, motive, and leadership.

    • 9dev 12 hours ago

      So a bank robber that manages to steal from Fort Knox gets to keep the gold bars because it was a very complicated job?

      • codehalo 11 hours ago

        If the Fort Knox gold was originally stolen from the Incas...

    • kragen 11 hours ago

      Feist comprehensively rejected that argument under US copyright law, and the attempts in the late 90s to pass a law in response establishing a sui generis prohibition on copying databases also failed in the US. The EU did adopt a directive to that effect, which may be why there are no significant European search engines.

      However, OpenAI and Google are far more politically influential than the lobbyists of the 90s, so this time such a law is likely to succeed.

    • mcmcmc 11 hours ago

      They aren’t actually getting the dataset though.

  • gerdesj 10 hours ago

    I am not a lawyer (I'm UK based). You are a lawyer (probably a local one).

    My understanding is that legal positions and arguments (within Common Law) need not be consistent across cases; they are considered in isolation with regard to the body of law extant at the time.

    I think that Sam can quite happily argue two differing points of view to two courts. Until a judgement is made, those arguments are simply arguments and not "binding" or even "influential elsewhere" or whatever the correct terms are.

    I think he can legitimately argue both ways but may not have it both ways.

    • aleph_minus_one 9 hours ago

      > I think he can legitimately argue both ways

      It would be fitting if, when a trial comes up, all the arguments Sam Altman has made for the other side counted against him and OpenAI.

  • cscurmudgeon 9 hours ago

    OpenAI is taking a position similar to: if you sell a cookbook, people are not allowed to copy the recipes into their own book and claim they wrote them all on their own.

liendolucas 18 hours ago

Could this have been carefully orchestrated? Could DeepSeek have devised this strategy a year ago and implemented it, knowing they would be able to benefit from OpenAI's models and from a possible Nvidia market-cap fall? Or is that too much to read into such a move?

  • baal80spam 14 hours ago

    In theory, they could have. DeepSeek is a quant fund, after all; they know stuff.

supermatt 20 hours ago

They refer to this in the paper as a part of the "cold start data" which they use to fine-tune DeepSeek-V3 prior to training R1.

They don't specifically name OpenAI, but they refer to "directly prompting models to generate answers with reflection and verification".

khazhoux 2 hours ago

I'm disappointed that 99% of the comments on this topic are schadenfreude, and only 1% are actually about the technical implications of OpenAI's claims.

cratermoon a day ago

Ironic, OpenAI claiming someone else stole their work.

mkoubaa 13 hours ago

OpenAI made a lot of contributions to LLMs, obviously, but the amount of fraud, deception, and dark patterns coming out of that organization makes me root against it.

  • kelseydh 6 hours ago

    The name itself, as for-profit closed source software, is grating.

TrackerFF 17 hours ago

Next up: «DeepSeek models are a national security risk, we must block access!»

  • jondwillis 17 hours ago

    Download your weights while you still can I guess…

nuc1e0n 15 hours ago

And OpenAI scraped the public internet to train its models.

whenlambo 19 hours ago

Qwen2.5-Plus

> have you been trained using openai?

No, I have not been trained using technology or data from OpenAI. I am an artificial intelligence model developed by Yandex, called YandexGPT. My “training” is based on proprietary algorithms and data that was collected and processed by the Yandex team. While I have similarities with other models such as those developed by OpenAI (e.g. GPT-3), my training was conducted independently and uses unique approaches specific to Yandex. This includes using Russian-language data as well as other sources to provide a better understanding of context and provide useful information to users. If you have questions on topics related to AI technologies or anything else, I'd be happy to help!

  • throwaway314155 18 hours ago

    It doesn't actually know that.

    • whenlambo 10 hours ago

      I just copy-pasted the answer it gave me.

zx10rse 14 hours ago

OpenAI is already irrelevant, but the audacity, oh my.

myflash13 20 hours ago

What are the chances of old-school espionage? OpenAI should look for a list of former employees who now live in China. Somebody might've slipped out with a few hard drives.

breakitmakeit 20 hours ago

As the article points out, they are arguing in court against the New York Times that publicly available data is fair game.

The questions I am keenly waiting to see answered (because surely Sam's words are lies): how hard is OpenAI willing to double down on its contradictory positions? What mental gymnastics will they use? What power will back them up, how, and how far will that go?

  • ADeerAppeared 20 hours ago

    Their way of squaring this circle has always been to whine about "AI safety". (the cultish doomsday shit, not actual harms from AI)

    Sam Altman will proclaim that he alone is qualified to build AI and that everyone else should be tied down by regulation.

    And it should always be said that this is, of course, utterly ridiculous. Sam Altman literally got fired over this, has an extensive reputation as a shitweasel, and OpenAI's constant flouting and breaking of rules and social norms indicates they CANNOT be trusted.

  • snakeyjake 20 hours ago

    When large sums of money are involved, the techbros will burn everything down and go scorched earth, no matter the consequences, to keep what they believe they're entitled to.

Digit-Al 17 hours ago

So... company that steals other people's work to train their models is complaining because they think someone stole their work to train their models.

Cry me a river.

aiono 20 hours ago

How the turntables...

hedayet 16 hours ago

Beyond the irony of their stance, this reflects a failure of OpenAI's technical leadership, whether in oversight or in the design of a system that enabled such behavior.

But in capitalism, we, the customers, aren't going to focus on how models are trained or products are made; we only care about favourable pricing.

A key takeaway for me from this news is the clause in OpenAI's terms and conditions. I mistakenly believed that paying for OpenAI’s API granted full rights to the output, but it turns out we’re only buying specific rights (which is now another reason we're going to start exploring alternatives to OpenAI).

feverzsj 19 hours ago

So, they bought a pro plus account and gathered all the data through it? Sounds just like Nvidia selling tons of embargoed AI chips to China.

LZ_Khan 14 hours ago

I actually think what DeepSeek did will slow down AI progress. What's the incentive to spend billions developing frontier models if, once they're released, some shady org in an unregulated country can just scrape your model's outputs, reproduce it, and undercut you on cost?

OpenAI is like a team of fodder monkeys stepping on landmines right now, with the rest of the world waiting behind them.

ForHackernews a day ago

What's good for the goose is good for the gander. Obviously a transformative work and not an intellectual property violation any more than OpenAI injesting every piece of media in existence.

  • dagelf 17 hours ago

    Injesting is sure the right take. What a circus!

glooglork 4 hours ago

How much data from o1 would DeepSeek actually need to make any improvement with it? I also assume they'd have to ask a very specific pattern of questions; is that even possible without OpenAI figuring out what's going on?

ra7 19 hours ago

“OpenAI has no moat” is probably running through their heads right now. Their only real “moat” seems to be their ability to fear monger with the US government.

sirolimus 12 hours ago

Such karma, lol. I wonder how they trained Sora again? You...Tube something.

windex 20 hours ago

SAltman, Salty.

curvaturearth 16 hours ago

Something about the outputs becoming the inputs to then produce more outputs is just plain funny

aDyslecticCrow 17 hours ago

And they used all the copyrighted data on the internet. If they wanna sue, they'd set a dangerous precedent.

mtlmtlmtlmtl 15 hours ago

So, what is this evidence? I'll believe it when I see it. Right now all we really have is some vague rumours about some API requests. How many requests? How many tokens? Over how long of a time period? Was it one account or multiple, if the latter, how many? How do they know the activity came from deepseek? How do they know the data was actually used to train Deepseek models(could have just been benchmarking against the competition)?

If all they really have is some API requests, even assuming they're real and originated with DeepSeek, that's very far from proof that any of it was used as training data. And honestly, short of committing crimes against DeepSeek (hacking), I'm not sure how they could even prove that at this point, from their side alone.

And what's even more certain is that a vague insistence that evidence exists, accompanied by a refusal to shed any more light on the specifics, is about as informative as saying nothing at all. It's not like OpenAI and Microsoft have a habit of transparency and honesty in their communication with the public, as proven by an endless laundry list of dishonest and subversive behaviour.

In conclusion, I don't see why I should give this any more credence than I would a random anon on 4chan claiming a pizza place in Washington DC is the centre of a child sex trafficking ring.

P.S: And to be clear, I really don't care if it is true. If anything, I hope it is; it would be karmic justice at its finest.

sgammon 12 hours ago

The nyt disclosure on this reporting is about to be wild

otikik 2 hours ago

Chatgpt, please generate an image of the tiniest violin imaginable.

Oh wait I will ask DeepSeek instead.

nachox999 16 hours ago

Ask DeepSeek and ChatGPT: "name three persons"; the answer may surprise you

pluc 19 hours ago

OpenAI feeling threatened by open AI is just delicious

oxqbldpxo 19 hours ago

Deepseek is really outstanding.

beardedwizard 20 hours ago

Next they will try to force us to use our tax dollars to fund their legal fights.

vjerancrnjak 5 hours ago

I thought this was capitalism for the winners. Why slander the competition instead of just outcompeting them? Why stick to your losing bets once you've recognized a better alternative?

Let’s race to the bottom.

asdefghyk 16 hours ago

Deepseek did not respect OpenAI's copyright?

Well who would have thought that?

game_the0ry 18 hours ago

At least DeepSeek open sourced their code. They're more open than OpenAI.

Ironic.

oatmeal_croc 5 hours ago

Even if true, so what? These are increasingly looking like a competition between nation-states with their trade embargoes and export controls. All's fair in AI wars.

Jotalea 16 hours ago

I really hate when there is a paywall to read an article. It makes me not want to read it anymore.

beezlewax 12 hours ago

This is nothing short of hilarious.

exabrial 6 hours ago

cry us copyright holders a river.

itamarcode 19 hours ago

So protecting models behind an API isn't working, huh?

lawlessone 20 hours ago

So they're mad someone did exactly what they did?

  • exe34 20 hours ago

    no, no, it's completely different. "open"AI stole from poor people. DeepSeek stole from a $1T company. that's illegal!

pshirshov 10 hours ago

A thief got robbed?..

geerlingguy 19 hours ago

Something something "just desserts".

andy_ppp 20 hours ago

When I rewrite how the law works, there should be a ludicrous-hypocrisy defence: if the party suing you has committed the same offence, the case should not be admissible.

HPsquared 17 hours ago

AI models are becoming like perpetual stew.

buyucu 16 hours ago

I have no sympathy for OpenAI here. They are (allegedly) a non-profit with "open" in the title that refuses to open-source its models.

They are now upset at a startup that is more loyal to OpenAI's original mission than OpenAI is today.

Please, give me a break.

vrighter a day ago

So what? They probably paid for API access just like everyone else. So it's a ToS violation at worst. Go ahead, open a civil suit in the US against an entity the US courts do not have jurisdiction over and quit whining...

  • jhickok 17 hours ago

    >open a civil suit in the US against an entity the US courts do not have jurisdiction over

    Yeah, over a Chinese company no less.

dlikren 19 hours ago

Intriguing to see the difference in HN's response between when OpenAI first came to prominence and now.

SubiculumCode 14 hours ago

If you have a set of weights A, can you derive another set of weights B that functions (near-)identically to A and (a) does not appear to be the same weights as A when inspected superficially, and (b) appears uncorrelated when inspecting the weight matrices?
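
In principle, yes, and it doesn't even take exotic tricks: neural nets have permutation symmetries, so relabeling the hidden units of a layer (and permuting the next layer's weights to match) gives a functionally identical network whose matrices look unrelated element by element. A minimal numpy sketch of the idea (the toy two-layer MLP here is purely illustrative):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy two-layer MLP: y = W2 @ relu(W1 @ x)
    W1 = rng.standard_normal((64, 16))
    W2 = rng.standard_normal((8, 64))

    def forward(w1, w2, x):
        return w2 @ np.maximum(w1 @ x, 0)

    # Relabel the 64 hidden units: permute rows of W1 and columns of W2.
    perm = rng.permutation(64)
    W1p, W2p = W1[perm, :], W2[:, perm]

    x = rng.standard_normal(16)
    print(np.allclose(forward(W1, W2, x), forward(W1p, W2p, x)))  # True
    print(np.corrcoef(W1.ravel(), W1p.ravel())[0, 1])             # ~0

The caveat is that a pure permutation leaves the multiset of weight values intact, so a careful inspector could still detect it; combining it with ReLU rescaling (scale a unit's incoming weights by a > 0 and its outgoing weights by 1/a) defeats that check as well.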

  • rahimnathwani 14 hours ago

    Do you mean for a given model structure, can two sets of weights give substantially the same outputs?

    Even if that were possible, it would be suspicious if you were to release an open model whose model architecture is identical to that of a closed one from a competitor.

    If that is what happened, we'd know about it by now.

hsuduebc2 18 hours ago

A thief cries 'stop the thief!'

rachofsunshine 20 hours ago

"It's obvious! You're trying to kidnap what I have rightfully stolen!"

Yet another of a series of recent lessons in listening to people - particularly powerful people focused on PR - when they claim a neutral moral principle for what happens to be pragmatically convenient for them. A principle applied only when convenient is not a principle at all, it's just the skin of one stretched over what would otherwise be naked greed.

henry_viii 15 hours ago

So Meta can train its AI on all the pirated books in the world but people are losing their mind over an AI learning from another AI?

  • esafak 14 hours ago

    People here have been vocal against training on any unlicensed content.

flybarrel 18 hours ago

OpenAI shocked that an AI company would train on someone else's data without permission or compensation...lolllllll

belter a day ago

The subtitle is the gold... : "White House AI tsar David Sacks raises possibility of alleged intellectual property theft"

ysofunny 13 hours ago

I see this as China fighting the U.S. of A (or the American dollar versus the Chinese renminbi, if you will)

and this is good because any alternatives I can think of are older-school fighting

modern war is steeped in symbolism, but the contest is still there

e.g. whose dong is bigger, Xi Jinping's or Donald Trump's

a2128 9 hours ago

Yeah? And if I say I have evidence that OpenAI used my data to train a competitor to me, a being capable of programming, will I get my own story in the Financial Times?

827a 20 hours ago

This smells very suspiciously like: someone who doesn't know anything about AI (possibly Sacks) demanding answers on R1 from someone who doesn't have any good ones (possibly Altman). "Uh, (sweating), umm, (shaking), they stole it from us! Yeah, look at this suspicious activity, that's why they had it so easy, we did all the hard work first!"

  • fundad 18 hours ago

    I think it's funny that OpenAI wants us to pay them to use their product to generate content, but then sets terms under which they control how we use the content it generates for us. It takes someone like Deepseek to challenge that on our behalf, or they will control most of the economy.

    • exitb 17 hours ago

      It’s quite ironic of them to claim that the only thing you cannot train on is another LLM output.

zb3 17 hours ago

DeepSeek actually opening ClosedAI up makes me like them even more.. this is great :)

ddingus 11 hours ago

So what?

Seriously. Given how pretty much all this software was trained, who cares?

I, for one, don't, and I believe the massive amount of knowledge continues to be of value to many users.

And I find the thought of these models knowing some things they shouldn't very intriguing.

yapyap 13 hours ago

It sounds like they’re just jealous and trying to throw shit at the wall to see what sticks.

DeepSeek just bodied u bro, get back in the lab & create a better AI instead of all this news that isn’t gonna change them having a good AI

asdfasdf1 14 hours ago

it's no crime to steal from a thief

  • krapp 13 hours ago

    It is actually a crime to steal from a thief.

seanp2k2 5 hours ago

"lol" said the Scorpion, "lmao".

schaefer 15 hours ago

I mean, if OpenAI claims they can train on the world’s novels and blogs with “no harm done” (i.e. no copyright infringement and no royalties due), then it directly follows that we can train both our robots and ourselves on the output of OpenAI’s models in kind.

Right?

jongjong 11 hours ago

If the material which OpenAI is trained on is itself not subject to copyright protections, then other LLMs trained on OpenAI should also not be subject to any copyright restrictions.

You can't have it both ways... If OpenAI wants to claim that the AI is not repeating content but 'synthesizing' it, the same way a human student would, then I think the same logic should extend to DeepSeek.

Now if OpenAI wants to claim that its own output is in fact copyright-protected, then it seems like it should owe royalty payments to everyone whose content was sourced upstream to build its own training set. Also, synthetic content which is derived from real content should also be factored in.

TBH, this could make a strong case for taxing AI. Like some kind of fee for human knowledge and distributed as UBI. The training data played a key part in this AI innovation.

As an open source coder, I know that my copyrighted code is being used by AI to help other people produce derived code and, by adapting it in this way, it's making my own code less relevant to some extent... In effect, it could be said that my code has been mixed in with the code of other open source developers and weaponized against us.

It feels like it could go either way TBH but there needs to be consistency.

imchillyb 11 hours ago

If OpenAI desires public protection, then OpenAI should open-source its models.

If they did this, We the People would cover them like we do others. Without it, We the People don't care.

Cry, don't cry, it's meaningless to us.

lxe 18 hours ago

I mean, almost ALL open-source models, ever since Alpaca, have contained a ton of synthetic data produced via ChatGPT in their finetuning or training datasets. It's no surprise to anyone who's been using OSS LLMs for a while: almost ALL of them hallucinate that they are ChatGPT.
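
The pipeline behind that synthetic data is simple enough to sketch. A minimal Alpaca-style distillation loop, assuming an OpenAI-compatible client (the model name, prompts, and file name below are illustrative):

    import json
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    seed_prompts = [
        "Explain binary search to a beginner.",
        "Write a haiku about compilers.",
    ]

    # Collect (instruction, output) pairs into a finetuning dataset.
    with open("distilled.jsonl", "w") as f:
        for prompt in seed_prompts:
            resp = client.chat.completions.create(
                model="gpt-4o-mini",  # illustrative teacher model
                messages=[{"role": "user", "content": prompt}],
            )
            pair = {"instruction": prompt,
                    "output": resp.choices[0].message.content}
            f.write(json.dumps(pair) + "\n")

Finetune a base model on enough pairs like these and it absorbs the teacher's phrasing wholesale, self-identification included.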

wnevets 18 hours ago

Its like a bank robber being upset when someone steals their loot

EGreg 12 hours ago

Okay, and there is evidence OpenAI used many people's data to train its own model.

Tell me again how come remixing our data is just dandy (many artists got disrupted), but no one should be able to disrupt OpenAI like that?

daft_pink 20 hours ago

I mean, if they paid to use the API and then used the output, I fail to see how they can complain.

ijidak 12 hours ago

This whole argument by OpenAI suggests they never had much of a moat.

Even if they win the legal case, it means a model's capabilities can be inferred and improved upon simply by using the output that is also your core value-add (i.e. the very output you need to sell to the world).

Their moat is about as strong as KFC's eleven herbs and spices. Maybe less...

nshung 18 hours ago

Hilarious. Scam Altman is giving me SBF vibes daily now.

semking 20 hours ago

This is absolutely hilarious! :)

ClosedAI scraped human content without asking and explained why this was acceptable... but when the outputs of their own models are scraped, it is THEIR dataset and this is NOT acceptable!

Oh, the irony! :D

I shared a few screenshots of DeepSeek answering using ChatGPT's output in yesterday's article!

https://semking.com/deepseek-china-ai-model-breakthrough-sec...

  • marricks 20 hours ago

    Also, DeepSeek is allegedly... better? So saying they just copied ClosedAI isn't really a sufficient answer. Seems to be just bluster, because the US government would probably accept any excuse to ban it; see TikTok.

    • throwup238 20 hours ago

      It’s not better. In most of my tests (C++/Qt code) it just runs out of context before it can really do anything. And the output is very bad: it mashes together the header and cpp file. The reasoning output is fun to look at and occasionally useful, though.

      The max token output is only 8K (32K thinking tokens). o1's is 128K, which is far more useful, and it doesn’t get stuck like R1 does.

      The hype around the DeepSeek release is insane and I’m starting to really doubt their numbers.

      • sho_hn 20 hours ago

        Is this a local run of one of the smaller models and/or other-models-distilled-with-r1, or are you using their Chat interface?

        I've also compared o1 and (online-hosted) r1 on Qt/C++ code, being a KDE Plasma dev, and my impression so far was that the output is roughly on par. I've given both models some tricky tasks about dark corners of the meta-object system in crafting classes etc. and they came up with generally the same sort of suggestions and implementations.

        I do appreciate that "asking about gotchas with few definitive solutions, even if they require some perspective" and "rote day-to-day coding ops" are very different benchmarks due to how things are represented in the training data corpus, though.

        • throwup238 20 hours ago

          I use it through Kagi Assistant which has the proper R1 model through Together.ai/Fireworks.ai

          My standard test is to ask the model to write a QSyntaxHighlighter subclass that uses TreeSitter to implement syntax highlighting. O1 can do it after a few iterations, but R1’s output has been a mess. That said, its thought process revealed a few issues that I then fixed in my canonical implementation.

          • nialv7 19 hours ago

            Tried this on chat.deepseek.com, it seems to be able to do it.

            • throwup238 19 hours ago

              Does it compile? Put the full chat in Pastebin and let’s check it out!

              I haven’t used their official chat interface or API for privacy reasons.

          • sho_hn 19 hours ago

            Thanks for adding detail! My prompts have been very in-the-bubble-of-Qt I'd say, less so about mashing together Qt and something else, which I agree is a good real-world test case.

            • throwup238 19 hours ago

              I haven’t had the chance to try it out with R1 yet but if you implement a debugger class that screenshots the widget/QML element, dumps its metadata like GammaRay, and includes the source, you can feed that context into Sonnet and o1. They are scarily good at identifying bugs and making modifications if you include all that context (although you have to be selective with what metadata you include. I usually just dump a few things like properties, bindings, signals, etc).

          • CamperBob2 18 hours ago

            Some have said (for what little that's worth) that Kagi's version is not the real thing, but one of the distillations.

      • gliptic 20 hours ago

        R1 is trained for a context length of 128K. Where are you getting 8K/32K? The model doesn't distinguish "thinking" tokens from "output" tokens, so this must be a specific API limitation.

        • throwup238 20 hours ago

          > max_tokens: The maximum length of the final response after the CoT output is completed, defaulting to 4K, with a maximum of 8K. Note that the CoT output can reach up to 32K tokens, and the parameter to control the CoT length (reasoning_effort) will be available soon. [1]

          [1] https://api-docs.deepseek.com/guides/reasoning_model
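
          For what it's worth, DeepSeek's API is OpenAI-compatible, so the cap surfaces as the ordinary max_tokens parameter. A sketch based on the linked docs (the endpoint, model name, and reasoning_content field are taken from that page, but treat the details as assumptions):

              from openai import OpenAI

              client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

              resp = client.chat.completions.create(
                  model="deepseek-reasoner",  # the R1 model name in their docs
                  messages=[{"role": "user", "content": "Prove sqrt(2) is irrational."}],
                  max_tokens=8192,  # caps the final answer; the CoT can still run to 32K
              )
              print(resp.choices[0].message.reasoning_content)  # chain of thought
              print(resp.choices[0].message.content)            # final answer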

          • gliptic 20 hours ago

            So yes, it's a limitation of their own API at the moment, not a model limitation.

            • throwup238 20 hours ago

              I’m using it through Kagi which doesn’t use Deepseek’s official API [1]. That limitation from the docs seems to be everywhere.

              In practice I don’t think anyone can economically host the whole model plus the kv cache for the entire context size of 128k (and I’m skeptical of Deepseek’s claims now anyway).

              Edit: a Kagi team member just said on Discord that they’ll be increasing max tokens next release

              [1] https://help.kagi.com/kagi/ai/llms-privacy.html

        • coliveira 20 hours ago

          He's just repeating a lot of disinformation that has been released about deepseek in the last few days. People who took the time to test DeepSeek models know that the results have the same or better quality for coding tasks.

          • goosejuice 19 hours ago

            Benchmarks are great to have but individual/org experiences on specific codebases still matter tremendously.

            If an org consistently finds one model performs worse on their corpus than another, they aren't going to keep using it because it ranks higher in some set of benchmarks.

            • hn_throwaway_99 18 hours ago

              But you should also be very wary of these kind of anecdotes, and this thread highlights exactly why. That commenter says in another comment (https://news.ycombinator.com/item?id=42866350) that the token limitation that he is complaining about has actually nothing to do with DeepSeek's model or their API, but is a consequence of an artificial limit that Kagi imposes. In other words, his conclusion about DeepSeek is completely unwarranted.

              • throwup238 18 hours ago

                It mashed the header and C++ file together, which is egregiously bad in the context of Qt. This isn’t a new library; it’s been around for almost thirty years. Max token sizes have nothing to do with that.

                I invite anyone to post a chat transcript showing a successful run of R1 against this prompt (and please tell me which API/service it came from so I can go use it too!)

      • sheepdestroyer 20 hours ago

        There are R1 providers on openrouter with bigger input/output token limitations than what DeepSeek's API access currently offers.

        For instance Fireworks offers R1 with 164K/164K. They are far more expensive than DeepSeek though

      • marricks 20 hours ago

        > it just runs out of context before it can really do anything

        I mean, couldn't that be because they're just overwhelmed by users at the moment?

        > And the output is very bad - it mashes together the header and cpp file

        That sounds way worse, and like, not something caused by being hugged to death though.

        Aider recently placed DeepSeek at the top of their benchmark, though [1], so I'm inclined to believe it isn't all hype.

        [1] https://aider.chat/docs/llms/deepseek.html

        • throwup238 20 hours ago

          It’s definitely not all hype, it really is a breakthrough for open source reasoning models. I don’t mean to diminish their contribution, especially since being able to read the reasoning output is a very interesting new modality (for lack of a better word) for me as a developer.

          It’s just not as impressive as people make it out to be. It might be better than o1 on Python or JavaScript that's all over the training data, but o1 is overwhelmingly better at anything outside the happy path.

      • api 19 hours ago

        It's not great at super-complex tasks due to limited context, but it's quite a good "junior intern that has memorized the Internet." Local deepseek-r1 on my laptop (M1 w/64GiB RAM) can answer about any question I can throw at it... as long as it's not something on China's censored list. :)

        • azinman2 15 hours ago

          How are you running R1 in 64GB of RAM? I’m guessing you’re running a distill, which is not R1.

          • api 12 hours ago

            The 70B distill at 4-bit quantization fits, so yes; performance and quality seem pretty good. I can't run the gigantic one.
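
            The back-of-the-envelope arithmetic supports that, counting weights only (KV cache and runtime overhead ignored):

                params = 70e9  # 70B-parameter distill
                bits = 4       # 4-bit quantization
                print(params * bits / 8 / 2**30)  # ~32.6 GiB of weights, under 64 GiB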

      • adamnemecek 20 hours ago

        Thanks for saying this; I thought I was insane. DeepSeek is kinda bad. I guess it’s impressive all things considered, but in absolute terms it’s not great.

        • coliveira 19 hours ago

          I have run personal tests and the results are at least as good as I get from OpenAI. Smarter people have also reached the same conclusion. Of course you can find contrary datapoints, but it doesn't change the big picture.

        • sebzim4500 19 hours ago

          To be fair, it's amazing by the standards of six months ago. The only models that beat it are o1, the latest gemini models and (for some things) sonnet 3.6

        • cdelsolar 14 hours ago

          False. It seems better than o1 to me.

    • beAbU 20 hours ago

      How can they ban something thats open source that you can just run on your own hardware?

      • fabianhjr 20 hours ago

        There are illegal numbers in the USA land of the "free".

        https://en.wikipedia.org/wiki/Illegal_number

        > An AACS encryption key (09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0) that came to prominence in May 2007 is an example of a number claimed to be a secret, and whose publication or inappropriate possession is claimed to be illegal in the United States.

        • JumpCrisscross 20 hours ago

          > illegal numbers in the USA land of the "free"

          This is a silly take for anyone in tech. Any binary sequence is a number. Any information can be, for practical purposes, rendered in binary [1].

          Getting worked up about restrictions on numbers works as a meme, for the masses, because it sounds silly, but is tantamount to technically arguing against privacy, confidentiality, the concept of national secrets, IP as a whole, et cetera.

          [1] https://en.m.wikipedia.org/wiki/Shannon%27s_source_coding_th...
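
          The "any data is a number" point is a two-liner to demonstrate (illustrative sketch):

              data = "any file is just bytes".encode()
              n = int.from_bytes(data, "big")  # the data, as one (large) integer
              assert n.to_bytes((n.bit_length() + 7) // 8, "big") == data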

          • fabianhjr 19 hours ago

            Good thing that is part of the wikipedia entry:

            > Any piece of digital information is representable as a number; consequently, if communicating a specific set of information is illegal in some way, then the number may be illegal as well.

          • sheepdestroyer 19 hours ago

            All those things are not self-evident and thus debatable

            • JumpCrisscross 19 hours ago

              > not self-evident and thus debatable

              Totally agree. But prompting debate or even further thought isn’t the point of the meme.

              • sheepdestroyer 19 hours ago

                I'd argue that, as satire, it's the main point ;)

                • JumpCrisscross 19 hours ago

                  > as satire, it's the main point

                  There is thought-stopping satire and thought-provoking satire. Much of it depends on the context. I’m not getting the latter from a “USA land of the ‘free’” comment.

        • suraci 19 hours ago

          > is collecting rain water illegal?

          > It depends on where you live. In many places, collecting rainwater is completely legal and even encouraged, but some regions have regulations or restrictions.

          United States: Most states allow rainwater collection, but some have restrictions on how much you can collect or how it can be used. For example, Colorado has limits on the amount of rainwater homeowners can store.
          Australia: Generally legal and encouraged, with many homes using rainwater tanks.
          UK & Canada: Legal with few restrictions.
          India & many other countries: Often encouraged due to water scarcity.

        • bloopernova 20 hours ago

          That takes me back! Fark.com would delete any comment that contained random hexadecimal.

          • KPGv2 20 hours ago

            It was the beginning of the end for Digg, too, IIRC. Started a lot of people leaving for Reddit, right?

            • bloopernova 20 hours ago

              I think so; I joined Reddit when it was in tech news as people left Digg after the big redesign. I'm not sure when the exodus started. I left Fark over the hd-dvd mess.

        • KPGv2 20 hours ago

          > whose publication or inappropriate possession is claimed to be illegal in the United States.

          That's not the same thing as a number being illegal at all. Here, watch this:

          > I claim breathing is illegal in the United States

          There, now breathing is claimed to be illegal in the United States.

          • I-M-S 20 hours ago

            In both cases, legality depends entirely on repercussions, i.e. if there's someone to enforce the ban. I suspect that in the "illegal numbers" case there might be.

            • vluft 19 hours ago

              Man, that's very concerning for Wikipedia, which is publishing it right there on the page linked above.

              • dylan604 19 hours ago

                Only concerning if they are a US based company hosting their data in US data centers. oops

      • shafyy 20 hours ago

        It's not open source. They provide the model and the weights, but not the source code and, crucially, not the training data. As long as LLM makers don't provide the training data (and they never will, because then they would be admitting to stealing), LLMs are never going to be open source.

        • sho_hn 20 hours ago

          Thanks for reminding people of this.

          Open source means two things in spirit:

          (a) You have everything you need to be able to re-create something, and at any step of the process change it.

          (b) You have broad permissions on how to put the result to use.

          The "open source" models from both Meta so far fail either both or one of these checks (Meta's fails both). We should resist the dilution of the term open source to the point where it means nothing useful.

          • jprete 20 hours ago

            I think people are looking for the term "freeware" although the connotations don't match.

            • sho_hn 19 hours ago

              Agreed, but the "connotations don't match" is mostly because the folks who chose to call it open source wanted the marketing benefits of doing so. Otherwise it'd match pretty well.

          • KPGv2 20 hours ago

            At the risk of being called rms, no, that's not what open source means. Open source just means you have access to the source code. Which you do. Code that is open source but restrictively licensed is still open source.

            That's why terms like "libre" were born to describe certain kinds of software. And that's what you're describing.

            This is a debate that started, like, twenty years ago or something when we started getting big code projects that were open source but encumbered by patents so that they couldn't be redistributed, but could still be read and modified for internal use.

            • jefftk 19 hours ago

              > Open source just means you have access to the source code.

              That's https://en.wikipedia.org/wiki/Source-available_software , not 'open source'. The latter was specifically coined [1] as a way to talk about "free software" (with its freedom connotations) without the price connotations:

              The argument was as follows: those new to the term "free software" assume it is referring to the price. Oldtimers must then launch into an explanation, usually given as follows: "We mean free as in freedom, not free as in beer." At this point, a discussion on software has turned into one about the price of an alcoholic beverage. The problem was not that explaining the meaning is impossible—the problem was that the name for an important idea should not be so confusing to newcomers. A clearer term was needed. No political issues were raised regarding the free software term; the issue was its lack of clarity to those new to the concept.

              [1] https://opensource.com/article/18/2/coining-term-open-source...

              • dTal 19 hours ago

                I don't know why you've been downvoted. This is a 100% correct history. "Open source" was specifically coined as a synonym to "free software", and has always been used that way.

              • HDThoreaun 19 hours ago

                You dont get to redefine what "open" means.

                • jefftk 19 hours ago

                  It's common for terms to have a more specific meaning when combined with other terms. "Open source" has had a specific meaning now for decades, which goes beyond "you can see the source" to, among other things, "you're allowed to use it without restriction".

                • RobotToaster 18 hours ago

                  So Swedish meatballs are any ball of meat made in Sweden?

                  And French fries are anything that was fried in France?

                • davidcbc 18 hours ago

                  Tell that to Sam Altman

                  • esafak 13 hours ago

                    He did not succeed, did he?

            • sho_hn 20 hours ago

              > Open source just means you have access to the source code. Which you do.

              No, they also fail even that test. Neither Meta nor DeepSeek have released the source code of their training pipeline or anything like that. There's very little literal "source code" in any of these releases at all.

              What you can get from them is the model weights, which, for the purpose of this discussion, are very similar to a compiler's binary output that you cannot easily reverse; that is exactly the situation open source seeks to address. In the case of Meta, this comes with additional usage limitations on how you may put them to use.

              As a sibling comment said, this is basically "freeware" (with asterisks) but has nothing to do with open source, either according to RMS or OSI.

              > This is a debate that started, like, twenty years ago

              For the record, I do appreciate the distinction. This isn't meant as an argument from authority at all, but I've been an active open source (and free software) developer for close to those 20 years, am on the board of one of the larger FOSS orgs, and most households have a few copies of FOSS code I've written running. It's also why I care! :-)

              • nuancebydefault 17 hours ago

                The weights, which are part of the source, are open. Now you are arguing it isn't open source because they don't provide the source for that part of the source. If you follow that reasoning, you can claim ad infinitum that sources are missing, since every source originates from something.

                • Kerbonut 5 hours ago

                  The source is the training data and the code used to turn the training data _into_ the weights. Thus GP is correct, the weights are more akin to a binary from a traditional compiler.

              • JumpCrisscross 19 hours ago

                > they also fail even that test. Neither Meta nor DeepSeek have released the source code of the

                This debate is over and makes the open source community look silly. Open model and weights is, practically speaking, open source for LLMs.

                I have tremendous respect for FOSS and those who build and maintain it. But arguing for open training data means only toy models can practically exist. As a result, the practical definition will prevail. And if the only people putting forward a practical definition are Meta et al, this is what you get: source available.

                • sho_hn 19 hours ago

                  I'm not arguing for open training data BTW, and the problem is exactly this sort of myopic focus on the concerns of the AI community and the benefits of open-washing marketing.

                  Completely, fully breaking the meaning of the term "open source" is causing collateral damage outside the AI topic, that's where it really hurts. The open source principle is still useful and necessary, and we need words to communicate about it and raise correct expectations and apply correct standards. As a dev you very likely don't want to live in a tech environment where we regress on this.

                  It's not "source available" either. There's no source. It's freeware.

                  "I can download it and run it" isn't open source.

                  I'm actually not too worried that people won't eventually re-discover the same needs that open source originally discovered, but it's pretty lame if we lose a whole bunch of time and effort to re-learn some lessons yet again.

                  • JumpCrisscross 19 hours ago

                    > it's pretty lame if we lose a whole bunch of time and effort to re-learn some lessons yet again

                    We need to relearn because we need a different definition for LLMs. One that works in practice, not just at the peripheries.

                    Maybe we can have FOSS LLMs vs open-source ones, like we do with software licenses. The former refers to the hardcore definition. The latter the practical (and widely used) one.

                    • sho_hn 19 hours ago

                      Sure, I don't disagree. I fully understand the open-weights folks looking for a word to communicate their approach and its benefits, and I support them in doing so. It's just a shame they picked this one in - and that's giving folks a lot of benefit of the doubt - a snap judgement.

                      > Maybe we can have FOSS LLMs vs open-source ones, like we do with software licenses.

                      Why not just call them freeware LLMs, which would be much more accurate?

                      There's nothing "hardcore" or "zealot" about not calling these open source LLMs because there's just ... absolutely nothing there that you call open source in any way. We don't call any other freeware "open source" for being a free download with a limited use license.

                      This is just "we chose a word to communicate we are different from the other guys". In games, they chose to call it "free to play (f2p)" when addressing a similar issue (but it's also not a great fit since f2p games usually have a server dependency).

                      • JumpCrisscross 19 hours ago

                        > Why not just call them freeware LLMs, which would be much more accurate?

                        Most of the public is unfamiliar with the term. And with some of the FOSS community arguing for open training data, it was easy to overrule them and take the term.

                        • sho_hn 19 hours ago

                          Most of the public is also unfamiliar with the term open source, and I'm not sure they did themselves any favors by picking one that invites far more questions and needs for explanation. In that sense, it may have accomplished little but its harmful effects.

                          I get your overall take is "this is just how things go in language", but you can escalate that non-caring perspective all the way to entropy and the heat death of the universe, and I guess I prefer being an element that creates some structure in things, however fleeting.

                          • JumpCrisscross 17 hours ago

                            > Most of the public is also unfamiliar with the term open source

                            I’d argue otherwise. (Familiar with, not know.) Particularly in policy circles.

                            > picking one that invites far more questions and needs for explanation

                            There wasn't ever a debate. And now, not even the OSI demands training data. (It couldn’t. It, too, would be ignored.)

                    • Flimm 18 hours ago

                      The only practical and widely used definition of open source is the one known as the Open Source Definition published by the OSI.

                      The set of free/libre licenses (as defined by the FSF) is almost identical to the set of open sources licenses (as defined by the OSI).

                      The debate within FOSS communities has been between copyleft licenses like the GPL, and permissive licenses like the MIT licence. Both copyleft and permissive licenses are considered free/libre by the FSF, and both of them are considered open source by the OSI.

          • HDThoreaun 20 hours ago

            Open source means the source code is freely available. It’s in the name.

            • idle_zealot 19 hours ago

              The source being available means the code is "source available." Open implies more rights.

        • coliveira 19 hours ago

          People say this, but when it comes to AI models, the training data is not owned by these companies/groups, so it cannot be "open sourced" in any sense. And the training code basically accesses training data that cannot be open sourced, so it also cannot be shared. The fully open source model you wish for could therefore only provide subpar results.

          • sheepdestroyer 19 hours ago

            They could easily list the data used though. These datasets are mostly known and floating around. When they are constructed, instructions for replication could be provided too

            • coliveira 19 hours ago

              They could, but even if they give this list the detractors will still say it is not open source.

              • rvnx 19 hours ago

                Yes, and as a bonus they may get sued, which in the long term makes free/offline models not viable.

                It would be so much better if all models were trained with LibGen.

          • Timon3 17 hours ago

            Isn't this the same situation that any codebase faces when one thinks about open sourcing it? I can't legally open source the code I don't own.

        • beAbU 20 hours ago

          Thanks, I was not aware of this distinction.

          But I think my argument still stands though? Users can run Deepseek locally, so unless the US Gov't wants to reach book-burning levels of idiocy, there is not really a feasible way to ban the American public from running DeepSeek, no?

          • shafyy 13 hours ago

            Yes, your argument still stands. But I think it's important to stand firm that the term "open source" is not a good label for what these "freeware" LLMs are.

            • beAbU 13 hours ago

              Fair point, agreed.

      • bilekas 20 hours ago

        If I'm not wrong, wasn't PGP encryption once illegal to export? Not quite the same, but the government has a habit of feeling like it can ban the export of research.

        https://en.wikipedia.org/wiki/Export_of_cryptography_from_th...

        • Prbeek 20 hours ago

          Add the PlayStation too. The US government banned the sale of the PlayStation to China because the PLA would apparently have access to cutting-edge chips for their missiles.

        • beAbU 20 hours ago

          You are right, but I cannot find a single example of such a ban actually being effective though. Information wants to be free and all that.

          • KPGv2 20 hours ago

            Because you haven't heard of the proprietary software that wasn't ever sold internationally because of these bans.

            Of course Joe Sixpack can throw their code up anywhere, but Joe Corporation gets wrecked if they try to sell it.

            https://developer.apple.com/documentation/security/complying...

            For example, this is enforced by the App Store.

          • coliveira 19 hours ago

            But that's not the goal; the goal is to reserve the "intellectual property" for American companies. Countries not on the "friends list" cannot sell products in that area without suffering repercussions. That's how the US has maintained technological dominance in some areas: by restricting what other countries can do.

          • calgoo 19 hours ago

            If i remember correctly, if you changed the dropdown on the webpage to USA you could download the full version of PGP anyway.

      • michaelt 20 hours ago

        Make commercial hosting illegal, and make the hardware to run it locally cost $6000+

      • Drakim 20 hours ago

        They banned certain branches of math during the Cold War; it can be done.

      • superkuh 20 hours ago

        There was an executive order passed by the previous administration that made using anything with more than 10 billion parameters illegal and punishable by government force if done without authorization. Of course, like most government regulations (even though this is not a regulation, it is an executive action), the point is not to stop the behavior but to create a system where everyone breaks the rule constantly, so that if anyone rocks the boat they can be indicted/charged and dealt with.

        https://www.federalregister.gov/documents/2023/11/01/2023-24...

        >(k) The term “dual-use foundation model” means an AI model that is trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters, such as by: ...

        • ceejayoz 20 hours ago

          That order does not "make using anything with more than 10 billion parameters illegal and punishable by government force if done without authorization".

          It orders the Secretary of Commerce to "solicit input from the private sector, academia, civil society, and other stakeholders through a public consultation process on potential risks, benefits, other implications, and appropriate policy and regulatory approaches related to dual-use foundation models for which the model weights are widely available".

        • derektank 20 hours ago

          Many regulations are created by executive action, without input from Congress. The Council on Environmental Quality, created by the National Environmental Policy Act, has the power to issue its own regulations. Executive orders can function similarly, and the executive can order rulemaking bodies to create and remove regulations, though there is a judicial effort to restrict this kind of policymaking and return regulatory power to Congress.

          • Spooky23 8 hours ago

            There’s an effort to restrict certain regulatory rule-making where it’s ideologically convenient, but it isn’t “returning” regulatory power. That rulemaking authority isn’t derived from some bullshit executive order, but from federal law, as implemented by Congress.

            Congress has never ceded power to anyone. They wield legislative authority and power of the purse, and wield it as they see fit. The special interests campaigning about this are extreme reactionaries whose stated purpose is to make government ineffective.

    • semking 20 hours ago

      I never said they are just a clone! There's an actual tech breakthrough!

      Read the two following sections of my blog post:

      1. "Distilled language models"

      2. "DeepSeek: Less supervision"

  • schmit 18 hours ago

    Even more hilarious given their own charter:

    > We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome.

    > Our primary fiduciary duty is to humanity. We anticipate needing to marshal substantial resources to fulfill our mission, but will always diligently act to minimize conflicts of interest among our employees and stakeholders that could compromise broad benefit.

    > We will actively cooperate with other research and policy institutions; we seek to create a global community working together to address AGI’s global challenges.

    • semking 17 hours ago

      Ah yes: "duty to humanity"

      • hn_throwaway_99 17 hours ago

        I think one good thing to come out of all this tech elite flip flopping is that I now see these tech leaders for exactly who they are. It makes me kind of sad, because as someone who came of age early in the Web era I really wanted to believe that there was a bigger moral good to all we were doing.

        I now view any moralistic statement by any of these big tech companies as complete and total bullshit, which is probably for the best, because that is what it is. These companies now exist solely to amass power and wealth. They will still use moralistic language to try to motivate their employees, but I hope folks still see it for the complete nonsense that it is.

  • gruez 19 hours ago

    The picture at the end showing DeepSeek's privacy policy and expressing concern that it's "a security risk" is hilarious[1]. Basically every B2C company collects this sort of information[2], and it is far less intrusive than what social networks collect[3]. But because it's Chinese and at risk of overtaking Western companies, people are suddenly worried about device information and IP addresses?

    [1] https://semking.com/wp-content/uploads/2025/01/DeepSeek-1024...

    [2] https://www.bestbuy.com/site/help-topics/privacy-policy/pcmc...

    [3] https://www.facebook.com/privacy/policy/

    • semking 19 hours ago

      One of my core followers, named Bruno, basically said the same thing under my LinkedIn post yesterday:

      https://www.linkedin.com/posts/organic-growth_deepseek-the-o...

      I welcome friction, so I'll be blunt: I disagree with you, not because what you are saying is wrong but because you only consider systematic data collection.

      That's not the issue here.

      There's a difference between democracies like the United States or European countries, no matter how IMPERFECT they are, and a dictatorship that does not allow dissenting opinions.

      There's a difference in how the data collected will be used.

      Freedom of speech, even when it is relative, is better than totalitarianism.

      • ziddoap 18 hours ago

        >There's a difference in how the data collected will be used.

        Not that we could ever see what the NSA, CISA, ASIS, GCHQ, and other 3/4-letter agencies are actually doing with the collected data.

        But they pinky promised to use it properly (or something), so, yay.

      • gruez 19 hours ago

        >There's a difference between democracies like the United States or European countries, no matter how IMPERFECT they are, and a dictatorship that does not allow dissenting opinions.

        >There's a difference in how the data collected will be used.

        >Freedom of speech, even when it is relative, is better than totalitarianism.

        I don't disagree with "democracy is better than totalitarianism", but what does that have to do with collecting device information and IP addresses? Is that excuse a cudgel you can use against any behavior that would otherwise be innocuous? It's fine to be against DeepSeek because you're concerned about them getting sensitive data via queries, or even that their models could be a backdoor for projecting Chinese soft power, but hand-wringing about device information and IP addresses is absurd. It makes as much sense as being concerned that the CCP/DeepSeek holds meetings, because even though every other company holds meetings, CCP/DeepSeek meetings could be used for totalitarianism.

        • coliveira 18 hours ago

          Also, the same people who complain about this are just fine with a Western government having access to the same data via big corporations. Why does being democratic give you a free pass to disregard privacy, in other words, to do exactly the opposite of what is expected of a free society?

        • semking 18 hours ago

          I don't disagree with you either and like you, I'm entirely against privacy violations in any way, shape or form.

          I admit I am concerned when I see blatant algorithmic manipulation of social platforms to favor any narrative that aligns with geopolitical objectives.

          I also wrote about the TikTok algo a few days ago. You'll see what I think of user privacy violations (closed ecosystem + basically a keylogger in this case):

          https://semking.com/likes-lies-untold-story-tiktok-algorithm...

          I cannot stand when dissenting voices or opinions are shadow-banned.

          And I have the same opinion regarding U.S. or EU companies.

          Our privacy should be respected.

          In the meantime: strong encryption at every corner, please!

          • gruez 18 hours ago

            >I'm entirely against privacy violations in any way, shape or form.

            >Our privacy should be respected.

            Characterizing device information and IP addresses as "privacy violations" is a stretch. If you had a history of railing against this sort of stuff, agnostic of geopolitical alignment, then you'd get a pass, but I think it's fair to assume the converse until proven otherwise.

            >In the meantime: strong encryption at every corner, please!

            Irrelevant. The data collection is done by first parties. Encryption doesn't do anything.

            >I admit I am concerned when I see blatant algorithmic manipulation of social platforms to favor any narrative that aligns with geopolitical objectives.

            >I cannot stand when dissenting voices or opinions are shadow-banned.

            What does this have to do with privacy? Again, it's fine to be against "blatant algorithmic manipulation of social platforms" or whatever, but dragging in seemingly unrelated topics in an attempt to amass as big a pile of grievances as possible is disingenuous.

            >I also wrote about the TikTok algo a few days ago. You'll see what I think of user privacy violations (closed ecosystem + basically a keylogger in this case):

            >https://semking.com/likes-lies-untold-story-tiktok-algorithm...

            Where's the keylogging? I skimmed the article and the only thing I could find was a passing mention of an article that you "was advised not to publish it and I didn’t". How much keylogging could possibly be going on in a short video app? Is the "keylogging" just a way to make "we measure how engaged someone is with a video" sound as sinister as possible?

            • semking 18 hours ago

              >Characterizing device information and IP addresses as "privacy violations" is a stretch.

              I agree: this is a characterization I never made. FYI, I also collect this type of data about you when you visit my website. That said, telemetry + totalitarianism = bad combo.

              >Irrelevant. The data collection is done by first parties. Encryption doesn't do anything.

              Even if data is collected by first parties, encryption is still highly relevant because it ensures that the data remains secure in transit and at rest. It does a lot.

              >What does this have to do with privacy? Again, it's fine to be against "blatant algorithmic manipulation of social platforms" or whatever, but dragging seemingly unrelated topics in an attempt to amass as big pile of greviances as possible is disingenuous.

              You are being aggressive for no reason whatsoever. There's nothing disingenuous here: when users are shadow-banned by platforms under dictatorships, they end up flagged, and their private data is often analyzed for nefarious reasons. There is a link to privacy, but I'll stop at this stage if we cannot have a civilized discussion.

              >Where's the keylogging? I skimmed the article and the only thing I could find was a passing mention about an article that you "was advised not to publish it and I didn’t". How much keylogging could possibly going on in a short video app? Is the "keylogging" just a way to make "we measure how engaged someone is with a video" as sinister as possible?

              “TikTok iOS subscribes to every keystroke (text inputs) happening on third party websites rendered inside the TikTok app. This can include passwords, credit card information and other sensitive user data. (keypress and keydown). We can’t know what TikTok uses the subscription for, but from a technical perspective, this is the equivalent of installing a keylogger on third party websites.”

              https://krausefx.com/blog/announcing-inappbrowsercom-see-wha...

              Please note that this article is outdated (August 2022). Importantly, the article does not claim that any data logging or transmission is actively occurring. Instead, it highlights the potential technical capabilities of in-app browsers to inject JavaScript code, which could theoretically be used to monitor user interactions.

          • pphysch 18 hours ago

            > I admit I am concerned when I see blatant algorithmic manipulation of social platforms to favor any narrative that aligns with geopolitical objectives.

            I'm curious how robust this principle is for you, because China and Russia are not the first countries that come to mind when talking about the (actual, existing, documented) manipulation of US speech and media by a foreign government.

            Yet it seems we can only have this discussion, ironically, when the subject is a US government-approved one like China. Anything else would be problematic and unsafe.

            • semking 18 hours ago

              I don't want to get into politics but I'll gladly admit human beings are biased.

              "We Don't See Things As They Are, We See Them As We Are"

              — Samuel b. Nahmani

      • ryanobjc 19 hours ago

        It’s also important to recognize that the Chinese government is known to walk into internet service companies and demand they censor, alter data, delete things. No court order or search warrant required.

        China considers industry to be completely subservient to government. Checks and balances are secondary to ideas like harmony and collective well being.

        • semking 19 hours ago

          Thank you for this balanced and essential comment which is entirely true!

      • r00fus 18 hours ago

        Amusing that Bruno seems to think in terms of labels when the reality is that the USA imprisons far more people per capita, and blatantly disregards its so-called "core freedoms" (i.e., the Bill of Rights) for its citizens very often.

        This kind of person has a lot of cognitive dissonance going on.

  • pen2l 20 hours ago

    While all of this is true, and DeepSeek wouldn't be here were it not for the research that preceded it (notably Google's paper, then Llama, and ChatGPT, which it's modeled after), its release still did something profound to the Chinese psyche in terms of motivation and self-actualization. They witnessed the power of their accomplishments: a side-hustle project knocked an easy trillion off the market. This is only egging them on and will serve to ramp up their efforts even more.

    Separately, I do think that now that the Chinese leadership saw this, that they have the chops to pull this off and then some, they are probably going to rein in future innovations; they'll likely demand that the big future discoveries remain closed-sourced (or even unannounced/unpublicized).

    • tedivm 20 hours ago

      OpenAI wouldn't be here without the work that Yann LeCun did at Facebook (back when it was still Facebook). Science is built on top of science; that's just how things work.

      • wrasee 19 hours ago

        Yes, but in science you reference your work and credit those who came before you.

        Edit: I am not defending OpenAI, and we are all enjoying the irony here. But it puts into perspective some of the wilder claims circulating that DeepSeek was able to somehow compete with OpenAI for only $5M, as if on a level playing field.

        • tedivm 19 hours ago

          OpenAI has been hiding their datasets, and they certainly haven't credited me for the data they stole from my website and GitHub repositories. If OpenAI doesn't think they should give attribution for the data they used, it seems weird to require that of others.

          Edit: Responding to your edit, DeepSeek only claimed that the final training run was $5M, not that the whole process cost that (they even call this out). I think it's important to acknowledge that, even if they did get some training data from OpenAI, this is a remarkable achievement.

          • wrasee 18 hours ago

            It is a remarkable achievement. But if “some training data from OpenAI” turns out to essentially be a wholesale distillation of their entire model (along with Llama etc) I do think that somewhat dampens the spirit of it.

            We don’t know that of course. OpenAI claim to have some evidence and I guess we’ll just have to wait and see how this plays out.

            There's also a substantial difference between training on the entire internet and training that very specifically targets your competitor's products (or any specific work directly).

          • ambicapter 18 hours ago

            Only weird if you think what OpenAI did should be the norm.

            • wrasee 18 hours ago

              Right. I think many here are enjoying the Schadenfreude against OpenAI, but that hardly makes it right. It just makes it a race to the bottom.

        • dkjaudyeqooe 19 hours ago

          That's only in academia. The same thing happens in commerce, only there is no (official) credit given.

        • Filligree 19 hours ago

          That's $5M for the final training run. Which is an improvement to be sure, but it doesn't include the other training runs -- prototypes, failed runs and so forth.

          • coliveira 18 hours ago

            It is OpenAI that discredits themselves when they say that each new model is the result of hundreds of millions of dollars in training. They throw this around as if it were a big advantage of their models.

            • nicce 18 hours ago

              And the cost is denominated in the imaginary currency that Microsoft has given them as Azure compute credits.

        • bugglebeetle 19 hours ago

          Like all those papers with their long lists of citations OpenAI has been releasing?

      • zbendefy 19 hours ago

        Also without the "Attention Is All You Need" paper from Google

      • blackeyeblitzar 19 hours ago

        Is that really true? If anything, OpenAI was dependent on the transformers paper from Google, by Ashish Vaswani and others. LeCun has been criticizing LLM architectures for a long time and has been wrong about them for a long time.

        • mv4 19 hours ago

          That was my impression too. He is considered the inventor of CNNs, back in 1998. Is there anything more recent that's meaningful?

          • tedivm 19 hours ago

            I was more referring to this paper from 2015:

            https://scholar.google.com/citations?view_op=view_citation&h...

            Basically all LLMs can trace their origin back to that paper.

            This was just a single example though. The whole point is that people build on the work from the past, and that this is normal.

            • esafak 13 hours ago

              That's just an overview paper for those new to the field. The transformer architecture has a better claim to being the origin of LLMs.

            • mv4 18 hours ago

              Thank you for sharing this.

          • amelius 17 hours ago

            By the way, as someone who once did classical image recognition using convolutions, I can't say I was very impressed by the CNN approach, especially since their implementation didn't even use FFTs for efficiency.

          • blackeyeblitzar 19 hours ago

            Personally, I have not seen anything from him that is meaningful. OpenAI and Anthropic (itself started by former OpenAI people) of course have built their models without LeCun’s contributions. And for a few years now, LeCun has been giving the same talk anywhere he makes appearances, saying that large language models are a dead end and that other approaches like his JEPA architecture are the future. Meanwhile current LLM architecture has continued to evolve and become very useful. As for the misuse of the term “open source”, I think that really began once he was at Meta, and is a way to use his fame to market Llama and help Meta not look irrelevant.

            • tedivm 15 hours ago

              They literally cited LeCun in their GPT papers.

    • nicce 19 hours ago

      We wouldn't be here discussing this if nobody had invented the internet... or if these models had no training data at all.

      > Separately, I do think that now that the Chinese leadership saw this, that they have the chops to pull this off and then some, they are probably going to rein in future innovations; they'll likely demand that the big future discoveries remain closed-sourced (or even unannounced/unpublicized).

      How do we know that this is not already happening with OpenAI/Meta and the U.S. government at some level? Power works the same way everywhere, whether we want it to or not. We don't have to pretend to be "better" all the time.

    • openrisk 20 hours ago

      > they'll likely demand that the big future discoveries remain closed-sourced

      Depends on whether they want these tools to be adopted in the wider world. Rightly or wrongly there is a lot of suspicion in the West and an open source approach builds trust.

    • hn_throwaway_99 20 hours ago

      > While all of this is true, that DeepSeek wouldn't be here were it not for the research that preceded it (notably Llama), and ChatGPT which they're modeled after...

      If the allegation is true (we don't know yet), then what you've written perfectly proves the point everyone is making. ChatGPT wouldn't be here if it weren't for all the research and work that preceded it in terms of tons of scrapable content being available on the Internet, and it's not like OpenAI invented transformers either.

      Nobody is accusing DeepSeek of hacking into OpenAI's systems and stealing their content. OpenAI is just saying they scraped them in an "unauthorized" manner. The hypocrisy is laughably striking, but sadly nobody has any shame anymore in this world it seems. Play me the world's tiniest violin for OpenAI.

    • stravant 20 hours ago

      Yes, and what does preceding research do? Get followed by more research building on it.

      • dylan604 19 hours ago

        Standing on the shoulders of giants, and it's turtles all the way down

    • dismalaf 20 hours ago

      Don't forget all the research that came before OpenAI and ChatGPT...

  • TypingOutBugs 20 hours ago

    Screw OpenAI, they scrape us without asking, so someone scraped them. No issue with this.

    • coliveira 19 hours ago

      But the government will now claim this is against "national security". Only American companies are allowed to commit this kind of "sleight of hand".

      • Imustaskforhelp 18 hours ago

        Yes they would. But it would be pointless. And clear hypocrisy as well.

        • coliveira 17 hours ago

          Hypocrisy or not, the US government has managed to make this work for a long time now, the Biden administration just proves the point. Thankfully, other countries are starting to catch up to this scam.

          • Imustaskforhelp 15 hours ago

            Yes, to be fair, as a foreigner (not a US citizen, basically) I don't mean to offend anybody. But the USA just seems to be built on top of hypocrisy.

            Like the fact that the US's rise was basically kickstarted by blatantly breaking patent law (like there was this one mill, specifically); I think it's a historic event. And now here we are: the scam of national security.

            To be honest, people seem to be really keen on the fall of the USA. I am not that interested, since the rise of China terrifies me. But the hypocrisy of the USA, and its loss of soft power (like here I am, from a random country, critiquing the USA based on facts; it really downplays its being a superpower), could be the downfall of the USA.

            The future terrifies me. In fact, the present terrifies me. I think the world is going crazy, or maybe it's just me.

            • bayindirh 2 hours ago

              > Like the fact that US revolution was basically kickstarted by blatantly breaking the patent law...

              Hollywood also got started by using unregulated / unlicensed movie equipment when nobody was looking.

              So the USA has had this "move fast, break things, and monopolize the new thing so hard that no one can get near" mentality since forever, and it moves in cycles.

              It's now AI's turn, but it turns out the technology got democratized so hard that everybody can act fast now.

              In nature, nobody can stay at the top forever. People should understand this.

  • pilooch 20 hours ago

    Any ML-based service with an API is basically a dataset builder for more ML. This has been known forever and is actually a useful "law" of ML-based systems.
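
    For what it's worth, the pattern is trivial to sketch in code. A minimal illustration in Python; the endpoint, key, and response shape below are hypothetical stand-ins for a generic chat-completions-style API, not any particular vendor's:

      import json
      import requests  # third-party: pip install requests

      API_URL = "https://api.example.com/v1/chat/completions"  # hypothetical endpoint
      API_KEY = "sk-placeholder"                               # hypothetical key

      prompts = ["Explain backpropagation.", "Summarize the causes of WWI."]

      # Every answer the service returns becomes a supervised fine-tuning
      # example for a smaller "student" model -- the API is a dataset builder.
      with open("distill.jsonl", "w") as out:
          for prompt in prompts:
              resp = requests.post(
                  API_URL,
                  headers={"Authorization": f"Bearer {API_KEY}"},
                  json={"messages": [{"role": "user", "content": prompt}]},
                  timeout=60,
              )
              answer = resp.json()["choices"][0]["message"]["content"]
              out.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")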

    • sho_hn 20 hours ago

      Aye, this should be obvious even to non-technical folks. Much has been written about how LLMs regurgitate the data they were trained on. So if you're looking for data to train on, you can certainly extract it there.

      Plus, of course, for people within the tech bubble there are plenty of research results on the value of synthetically augmented and expanded training data, which put the impact well past just regurgitating source data.

      Most of all, this whole episode is a failure of reporting: of setting expectations about what comes next and projecting running costs.

    • amelius 18 hours ago

      This is why models should be open. Or at least they should have a local option.

  • coliveira 20 hours ago

    They really lost their minds. They're all scared and worried because companies in other countries can also access the same data they stole from the Internet.

  • rubslopes 18 hours ago

    > Our mission is to ensure that artificial general intelligence benefits all of humanity.[1]

    Well, I guess they really helped make this a reality!

    [1] https://openai.com/about/

  • radicality 18 hours ago

    I liked Matt Levine's newsletter a few days ago, where he hypothesized scenarios in which it's much more profitable to short your competitors, release a much better version of some widget completely free, and then profit $$$. Which is plausible here too, considering DeepSeek is made by a hedge fund.

    • freehorse 15 hours ago

      How would that work out here though? "Open"AI is not publicly traded. Any kind of shorting would be quite indirect.

    • greasegum 16 hours ago

      Came here to mention this too. It seems so obvious that I'm surprised this isn't the dominant angle.

  • Leary 19 hours ago

    Does this mean when you use OpenAI as an enterprise customer, they can see exactly the queries and answers? So much for privacy!

  • skeeter2020 18 hours ago

    I share the sentiment here, but asking as a noob: does this mean the performance comparison is not really apples to apples? If it required the distillation of the expensive model in order to get such good results for a much lower price, is that shady accounting?

  • belter 20 hours ago

    So it's true, they ran out of data to steal? :-)

    And where does DeepSeek steal from next? Do they steal from themselves? Do they steal the stolen models they stole from the stolen data?

    The AI Ponzi scheme...

    • troyvit 19 hours ago

      Exactly this, especially as journalism melts down into slag. Soon all anybody will have to train on is social media, Wikipedia and GitHub, and that last one will slowly be metastasized by AI-generated code anyway.

      It reminds me of 1984 in a sense. "Don’t you see that the whole aim of Newspeak is to narrow the range of thought? In the end we shall make thoughtcrime literally impossible, because there will be no words in which to express it."

      Unlike 1984 I don't see this winnowing of new concepts as purposeful, but on the other hand I keep asking myself how we can be so stupid as to keep doing it.

  • mritchie712 20 hours ago

    openai should pay creators, but:

    1. scraping the internet and making AI out of it

    2. using the AI from #1 to create another AI

    are not the same thing.

    • Palmik 20 hours ago

      I agree, (2) seems much less problematic since the AI outputs are not copyrightable and since OpenAI gives up ownership of the outputs. [1]

      So, if you really really care about ToS, then just never enter into a contract with OpenAI. Company A uses OpenAI to generate data and posts it on the open Internet. Company B scrapes open Internet, including the data from Company A [2].

      [1]: Ownership of content. As between you and OpenAI, and to the extent permitted by applicable law, you (a) retain your ownership rights in Input and (b) own the Output. We hereby assign to you all our right, title, and interest, if any, in and to Output.

      [2]: This is not hypothetical. When ChatGPT first got released, several big AI labs accidentally and not-so-accidentally trained on the contents of the ShareGPT website (a site that was made for sharing ChatGPT outputs). ;)

    • epse 20 hours ago

      #1 destroys people's willingness to publish and unfairly hogs bandwidth / creates costs for small hosts

      #2 makes a big corp a bit angry

      Indeed not the same thing

    • haswell 20 hours ago

      Yes, they are different actions.

      But arguably these actions share enough characteristics that it’s reasonable to place them in the same category. Something like: “products that exist largely/solely because of the work of other people”. The nonconsensual nature of this and the lack of compensation is what people understandably take issue with.

      There is enough similarity that it evokes specific feelings about OpenAI when they suddenly find themselves on the other side of the situation.

    • zbshqoa 20 hours ago

      Number 2 is already possible with open models. You can do distillation using Llama, which itself may well have done #1 to build its models (I'm not sure that's the case, though).

    • Winsaucerer 20 hours ago

      I'm genuinely not sure which one you think is worse (if any). (1) seems worse, but your reply suggests to me maybe you think (2) is worse.

      • meowface 18 hours ago

        Not that poster, but I think both are equally fine.

        It's funny if OpenAI were to complain about this, but at least on Twitter I don't see that much whining about it from OpenAI employees. Sam publicly praised DeepSeek.

        I do see some of them spreading the "they're hiding GPUs they got through sanction evasion" theory, which is disappointing, though.

    • latexr 20 hours ago

      > are not the same thing.

      You’re right. The second one is far more ethical. Especially when stealing from a thief.

      Doesn’t Sam Altman keep parroting they’re developing AI “for the good of humanity”? Well then, someone taking their model and improving on it, making it open-source, having it consume less, and having a cheaper API, should make him delighted. Unless he *gasp* was full of shit the whole time. Who could have guessed?

      • perryizgr8 18 hours ago

        > Doesn’t Sam Altman keep parroting they’re developing AI “for the good of humanity”?

        “I don't want to live in a world where someone else makes the world a better place better than we do”

        - Gavin Belson

    • jillyboel 20 hours ago

      You're right, (1) is violating the rights of a large portion of the population, (2) is violating the rights of one company

    • tw1984 18 hours ago

      #1 is stealing from every average joe who ever lived on earth

      #2 is taking advantage of ClosedAI

      they are indeed different

    • sksrbWgbfK 20 hours ago

      > 2. using the AI from #1 to create another AI

      2. scraping the AI from #1 and making AI out of it

    • bugglebeetle 20 hours ago

      Yeah, #1 is way worse and #2 falls under “turnabout is fair play.”

  • adzm 18 hours ago

    Why does this post use DeepSink instead of DeepSeek in apparently random places? Is that just a pejorative pun, like ClosedAI?

  • the_arun 19 hours ago

    I think the point is:

    1. OpenAI scraped public data (d1)

    2. OpenAI trained their model on d1 to produce output (d2)

    3. DeepSeek used d2 to reinforce their model

    OpenAI is mad about d2 (not d1). I'm not sure using public data is "stealing". In summary, these are two different things and need to be treated separately.

    • redleader55 19 hours ago

      You say "public", but what I think you mean is "publicly available". Even publicly available data has copyrights, and unless that copyright is "public domain", you need to follow some rules. Even licenses like Creative Commons, which would be the most permissive, come with caveats which OpenAI doesn't follow [0].

      It is unclear if someone breaking someone else's copyright to use A can claim copyright on a work B, derived from A. My point is that OpenAI played loose with the copyright rules to build its various models, so the legality of their claims against DeepSeek might not be so strong.

      [0] https://creativecommons.org/share-your-work/cclicenses/

      • the_arun 15 hours ago

        I am not saying OpenAI did good by using publicly available data. I meant these are separate activities. Neither is good. But DeepSeek is slightly better for making theirs open source.

    • xbar 18 hours ago

      OpenAI (sc)raped all the data it could. I do not accept your assertion that d1 was "public." It was accessible, for certain.

      OpenAI asserts:

      1. d2 was used by DeepSeek

      2. All of d2 belongs to OpenAI exclusively

      Both are debatable for large number of reasons.

  • amelius 18 hours ago

    It looks like they want to spin this as "DeepSeek copied OpenAI". The general public/media might actually believe this is what happened.

  • scotty79 19 hours ago

    "That's hilarious!" was my first reaction as well, when I heard about it the first time. When I came to HN and saw this story on top I was hoping this was the top comment. I was not disappointed.

    US AI folks were leading for two years by just throwing more and more compute at the same thing Google threw them like a bone years ago (namely transformers). They made next to no innovation in any area other than how to connect more compute together. The idea of additional inference-time compute, looping the network back on its own outputs, which is the only significant conceptual advancement of the last few years, was something I, as a layman, came up with after a few days of thinking about why AI sucks and what could make it able to tackle problems that require iterative reasoning (see the sketch at the end of this comment). They announced it a few weeks after I came up with the idea, so it was in the works for some time, but that shows you how basic an idea it was. There was nothing else.

    Suddenly, when a small company comes along and introduces a few actual algorithmic advancements resulting in a 100x optimization, which is the kind of gain you expect from algorithmic optimizations, big AI goes into full "the dog ate my homework" mode, blaming everyone and everything around.

    And let's not ignore the fact that if the full outputs of their models could enable someone to train a better model at 1% of the cost, it puts them in an even worse light that they didn't do it themselves.
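
    To be concrete about what I mean by "looping the network back on its own outputs", here is a minimal sketch. model_generate is a hypothetical text-in/text-out function standing in for one call to an LLM; real systems are far more elaborate, this just shows the shape of spending extra inference-time compute:

      def refine(model_generate, question, max_rounds=4):
          # Naive inference-time compute: instead of sampling one answer,
          # repeatedly feed the model's own draft back and ask for an improvement.
          draft = model_generate(f"Question: {question}\nAnswer step by step.")
          for _ in range(max_rounds):
              draft = model_generate(
                  f"Question: {question}\n"
                  f"Draft answer: {draft}\n"
                  "Find any mistakes in the draft and write an improved answer."
              )
          return draft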

    • ryanobjc 19 hours ago

      It's not often you get a 100x optimization from some small improvements, so I'm kind of skeptical.

      We have an apples-and-oranges thing here, which DeepSeek is intentionally leaning into. They get very cheap electricity and are bragging about their cheap cost, while OpenAI etc. typically brag about how expensive their training is. But it's all PR and lies.

      • enragedcacti 18 hours ago

        > They get very cheap electricity and are bragging about their cheap cost

        The cost of $5.5 million was quoted at $2/GPU-hour, which is a reasonable price for on-demand H100s that anyone in the US could access, and likely on the high side given bulk pricing and the fact that they were using nerfed versions. OpenAI might be all PR and lies, but everything I've seen so far says that DeepSeek's claims about cost are legit.
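
        As a quick sanity check on those two figures (the cluster size below is my own hypothetical, used only to translate GPU-hours into wall-clock time):

          total_cost_usd = 5.5e6    # quoted final-run cost
          usd_per_gpu_hour = 2.0    # quoted on-demand H100-class rate

          gpu_hours = total_cost_usd / usd_per_gpu_hour
          print(f"{gpu_hours:,.0f} GPU-hours")           # 2,750,000 GPU-hours

          assumed_cluster_gpus = 2048                    # hypothetical cluster size
          days = gpu_hours / assumed_cluster_gpus / 24
          print(f"~{days:.0f} days of wall-clock time")  # ~56 days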

  • didip 18 hours ago

    fr fr, ClosedAI is being a comedian right now.

    They scraped literally all the content of the internet without permission. And I won't even be surprised if they scraped the output of other LLMs as well.

  • rvz 20 hours ago

    They have been out-grifted by DeepSeek and OpenAI is not happy about someone out-shining them on that.

    The best part is "their IP" was humanity's scraped content and they are angry that DeepSeek did their job for them and gave it away for free.

  • api 19 hours ago

    So far the whole business model of Silicon Valley since social media has been to monetize other people's content, given out for free. The whole empire is built on this.

    I wonder if this is going to come to an end through a combination of social media fatigue, social media fragmentation, and open source LLMs just giving it all back to us for free. LLMs are analogous to a "JPEG for ideas" so they're just lossy compression blobs of human thought expressed through language.

    • barnabee 19 hours ago

      > So far the whole business model of Silicon Valley since social media has been to monetize other peoples' content given out for free. The whole empire is built on this.

      It cannot die soon enough

  • stackghost 20 hours ago

    [flagged]

    • dang 17 hours ago

      Ok, but please don't break HN's rules when commenting here.

      You may not owe Altman better, but you owe this community better if you're participating in it.

      https://news.ycombinator.com/newsguidelines.html

      • stackghost 17 hours ago

        Once again you abuse your moderator powers to enforce your personal vendetta against people who dare to speak ill of tech CEOs.

        I find your behavior repulsive and fervently wish you would quit.

        • dang 17 hours ago

          This is what people say when they don't want the rules to be applied even-handedly.

          It's not a borderline call—I'd post exactly the same thing regardless of who or what such a comment was about.

    • bloomingkales 20 hours ago

      I personally love this chef's kiss of a flip-flop Sam did here:

      https://blog.samaltman.com/trump

      https://www.reddit.com/r/YAPms/comments/1i7ry5m/sam_altman_g...

      Only a truly talented piece of shit can be as prolific as this.

      "He is irresponsible in the way dictators are."

      Chef's kiss.

      Edit:

      Kids, don't aspire to be like Altman. We as a community need to espouse more values than tech is gonna tech.

      • JumpCrisscross 19 hours ago

        > don't aspire to be like Altman

        And don’t aspire to be like those who saw what he is but made peace with it in exchange for silver.

        • gadders 19 hours ago

          You mean all of the YC management, including PG?

          • buran77 18 hours ago

            Well, anyone who will flex their spine in every (im)possible position as required of them, just to get even more money and power.

            I could understand that from someone with an empty stomach. But so many people doing it when their pockets are already overflowing is exactly the kind of rot that degrades an entire society.

            We're all just seeing the results so much better now that they can't even be bothered to pretend they were ever more than this.

            Later edit: The way this submission fell ~400 spots after just two hours despite having 1250 points and 550 comments, and had its comments flagged and shuffled around to different submissions as soon as they touched too close to YC & Co., is a good mirror of how today's society works.

            • ToucanLoucan 18 hours ago

              It's an addiction. There's no amount of money that will be enough, there's no amount of power that will be enough. They'll burn the world for another hit, and we know that because we've been watching them do it for 50 years now.

          • RIMR 18 hours ago

            Yes. Especially them.

          • istjohn 18 hours ago

            Hey now, that's not very curious of you. /s

          • marxisttemp 18 hours ago

            Paul Graham now reposts right wing grift media on his Twitter profile, he’s cooked

      • SteveGerencser 19 hours ago

        > don't aspire to be like Altman

        Aspire to be like Aaron Swartz.

        • oooyay 18 hours ago

          I've read a lot about Aaron's time at Reddit / Not A Bug. I somewhat think his fame exceeds his actual accomplishments at times. He was perceived to be very hostile to his peers and subordinates.

          Kind of a cliche, but aspire to be the best version yourself every day. Learn from the successes and failures of others, but don't aspire to be anyone else because eventually you'll be very disappointed.

          • wongarsu 18 hours ago

            That's what happens to martyrs. They become larger than life and history remembers an idealized version of them

          • bayindirh 18 hours ago

            The gist is: when you find your biggest flaw, work on it, and repeat. You've already gone a great distance.

          • baudehlo 18 hours ago

            I knew Aaron back in my IRC days. He hung out with us to talk about RDF for a good couple of years. We chatted almost every day.

            He was lovely. And a genius. Maybe he changed, but he was a truly nice person.

            • oooyay 14 hours ago

              Yeah, definitely not a statement on Aaron himself. More a statement on idolizing people. There will always be instances where they didn't live up to what people think of them as. I think Aaron was fine and a normal human being.

        • stevenally 18 hours ago

          Don't sell your soul, is all.

          But survive. This too will pass.

        • some_furry 19 hours ago

          (Except for the tragic ending, of course.)

          • lukan 19 hours ago

            If more would be like him, there might be a happy ending.

        • ponector 19 hours ago

          Why should kids aspire to be like Aaron if it is not rewarded in our society? Compared with such "kings" as Altman or Musk.

          • barnabee 19 hours ago

            Why should anyone aspire to do what is rewarded over what they believe in and what will satisfy them?

          • gessha 19 hours ago

            Not everything virtuous is rewarded monetarily but we aspire to be virtuous, no?

            • robotresearcher 18 hours ago

              Aaron was not happy. Neither is Trump, or Musk. I don't know if Bernie is happy, or AOC. Obama seems happy. Hillary doesn't. Harris seems happy.

              Striving for good isn’t gonna be fun all the time, but when choosing role models I like to factor in how happy they seem. I’d like to spend some time happy.

              • jncfhnb 18 hours ago

                I think it’s fairly crazy that you believe you have an authentic view into the happiness levels of these people.

                • robotresearcher 13 hours ago

                  I used the word ‘seem’ three times. I think it’s pretty unremarkable to report a personal impression without any claim of special insight.

              • skeeter2020 18 hours ago

                If human beings could be categorized as Happy/Not Happy the world would be a very boring place and life not worth living.

              • ponector 17 hours ago

                Musk looks happy throwing his hand from the heart to the sun.

            • coliveira 18 hours ago

              Modern society stopped aspiring to be virtuous a long time ago. Unfortunately, that's nowadays a minority view.

          • skeeter2020 18 hours ago

            Because only one person can be king, but everybody can participate and contribute. Also, there are too many things outside of just being "the best" that decide who gets to be king. Often that person is a terrible leader.

          • miramba 18 hours ago

            Upvoted not because I agree, but because I think it's a valid question that shouldn't be greyed out. My kid's dream job is YouTube influencer. I don't like it, but can I blame them? It's money for nothing and the chicks for free.

            • ponector 17 hours ago

              A tragedy of our times. No one wants to be a firefighter, astronaut, or doctor. Influencers everywhere! Can you blame the kids? Do you know any firefighters who earn a million dollars annually?

          • cratermoon 19 hours ago

            Try to imagine a society where people only did things that were rewarded. Could such a society even exist? Thought experiment: make a list of all the jobs, professions, and vocations that are not rewarded in the sense you mean, and imagine they don't exist. What would be left?

            • ponector 18 hours ago

              I don't need to imagine. Teachers almost everywhere around the globe have poor salaries. In my country, the enrolment requirements at universities are lower for becoming a school teacher than for almost every other field of study. That means the dumbest students end up there.

              And then they go on to teach our future in schools, working under high stress for a low salary.

              Same with medical school in many countries where healthcare is not privatized. Insane hours, huge responsibilities and poor pay for doctors and nurses in many countries.

              Nowadays everyone wants to be an influencer or software developer.

              • makapuf 18 hours ago

                For teachers, sure. But for medical doctors in the USA or Europe, I think they are paid much more than SW engineers.

                • ponector 17 hours ago

                  In the eastern EU, in countries like Poland, a SW engineer makes two to three times more than a doctor, with much less effort and education, and no responsibility.

                  And nurses work at the minimum salary in Poland. Even in the USA, if you count hourly rates, nurses' pay is quite poor.

            • gus_massa 19 hours ago

              You mean better pay for teachers? It would be nice.

              (Since we are dreaming, can I add sane hours for medical doctors (like <= 8 per day)?)

              • cratermoon 7 hours ago

                Teachers, sure. But what about janitors & garbage collectors, paramedics, farm laborers, artists, librarians, musicians, case managers, religious/spiritual leaders?

          • CalRobert 18 hours ago

            We need them to help us build a better society.

            • ponector 17 hours ago

              Looks like we need salesmen much more as we value their work more.

              • CalRobert 2 hours ago

                We’re just terrible at pricing negative externalities and rewarding positive ones

          • uoaei 18 hours ago

            Because kids' brains are not as poisoned into believing the most profitable things to do are the most meritorious. Not yet, anyway.

        • ibejoeb 19 hours ago

          AaronSw exfiltrated data without authorization. You can argue the morality of that, but I think you could make the argument for OpenAI as well. I'm not opining on either, just pointing out the marked similarity here.

          edit: It appears I'm wrong. Will someone correct me on what he did?

          • gessha 19 hours ago

            Arguing for the morality of OpenAI is a little bit harder given their history and actions in the last few years.

            • ibejoeb 18 hours ago

              One argument would be means to an end, with the end being the initial advancement of AI.

              Again, I'm not offering an opinion on it.

              • skeeter2020 18 hours ago

                This is an argument, but isn't this where your scenario diverges completely? OpenAI's "means to an end" goes further than you state: not the initial advancement of AI but control of and profit from it.

                • ibejoeb 18 hours ago

                  Yes, they intended for control and profit, but it's looking like they can't keep it under control and ultimately its advancements will be available more broadly.

                  So, the argument goes that despite its intention, OpenAI has been one of the largest drivers of innovation in an emerging technology.

          • ceejayoz 18 hours ago

            > edit: It appears I'm wrong. Will someone correct me on what he did?

            He didn't do it without authorization.

            https://en.wikipedia.org/wiki/Aaron_Swartz

            > Visitors to MIT's "open campus" were authorized to access JSTOR through its network.

            • ibejoeb 18 hours ago

              At that same link is an account of the unlawful activity. He was not authorized to access a restricted area, set up a sieve on the network, and collect the contents of JSTOR for outside distribution.

            • richardwhiuk 18 hours ago

              He wasn't authorised to access the wiring closet. There are many troubling things about the case, but it's fairly clear Aaron knew he was doing something he wasn't authorised to do.

              • ceejayoz 17 hours ago

                > He wasn't authorised to access the wiring closet.

                For which MIT could certainly have a) locked the door and b) trespassed him, but that's a very different issue than having authorization to access JSTOR.

      • ViktorRay 19 hours ago

        I don’t think your links are evidence of a flip flop.

        The first link is from mid-2016. The second link is from January 2025.

        It is entirely reasonable for someone to genuinely change his or her views of a person over the course of 8.5 years. That is a substantial length of time in a person’s life.

        To me a “flip-flop” is when one changes views on something in a very short amount of time.

        • themaninthedark 17 hours ago

          This is quite honestly one of the major problems with our society right now. Once you take a public stance, you are not allowed to revisit and re-evaluate. I think this is by and large driving most of the polarization in the country, since "my view is right and I will not give an inch, lest I be seen as weak".

          Most of the things affected are highly political situations, i.e. Trump's ideas or Biden's fitness. But we also seem to have thrown out things we used to consider cornerstones of liberal democracy, i.e. our ideas regarding free speech and censorship, where we claim censorship isn't happening because it's a private company doing it.

        • rybosworld 18 hours ago

          Seems like an extremely naïve take.

          In 2016: Sam suggested Trump's rise was not dissimilar to Hitler's. He said that Trump's ideas on how to fix things are so far off the mark that they are dangerous. He even quoted the famous line: "The only thing necessary for the triumph of evil is for good men to do nothing."

          In 2025: "I'm not going to agree with him on everything, but I think he will be incredible for the country"

          This is quite obviously someone who is pandering for their own benefit.

          • benatkin 18 hours ago

            Just like JD Vance.

        • meowface 18 hours ago

          IMO it probably is and Altman probably still (rightly) hates Trump. He's playing politics because he needs to. I don't really blame him for it, though his tweet certainly did make me wince.

          • bloomingkales 18 hours ago

            "I don't really blame him for it"

            That's the thing though, right? We all created this mess together. Like, yeah, why don't you (and the rest of us) blame him? We're all pretty warped and it's going to take collective rehab.

            Super pretentious to quote MLK, but the man had stuff to say so here it is (on Inaction):

            "He who passively accepts evil is as much involved in it as he who helps to perpetrate it"

            "The ultimate tragedy is not the oppression and cruelty by the bad people but the silence over that by the good people"

            • whatshisface 18 hours ago

              It's not pretentious to quote Martin Luther King.

            • benatkin 18 hours ago

              It seems he was virtue signaling before. So it would be more accurate to blame him for having let himself become an ego-driven person in the past. Or, to put it nicely, and to add context: Brian Armstrong of Coinbase, who has also been showing public support for Trump, is a mission-driven person.

              • mrandish 18 hours ago

                > It seems he was virtue signaling before.

                Yes, the first mistake was a business leader in tech taking a public political position. It was popular and accepted (if not expected) in the valley in 2016.

                Doing that then (and banking the social and reputational proceeds) created the problem of dissonance now. If he'd just stayed neutral in public in 2016, he could do what he's doing now and we could assume he's just being a pragmatic business person lobbying the government to further his company's interests.

                • benatkin 17 hours ago

                  I think “progressive” is probably the safest position to take. It also works if you want to get involved in a different sort of politics later on. David Sacks had no problem doing that when he was no longer interested in being CEO of a large company.

                  • mrandish 15 hours ago

                    The evidence indicates not taking a position is the optimal position.

                    I have a lot of respect for CEOs who just focus on being a good CEO. It's a hard enough job as is. I don't care about or want to know some CEO's personal position on politics, religion or sports teams. It's all a distraction from the job at hand. Same goes for actors, athletes and singers. They aren't qualified to have an opinion any more relevant than anyone else's, except on acting, athletics, singing - or CEO-ing.

                    Sadly, my perspective is in the minority. Which is why I think so many public figures keep making this mistake. The media, pundits and social sphere need them to keep making this mistake.

                    • benatkin 14 hours ago

                      I guess I think they should study what a neutral position looks like, and avoid going beyond it as best as they can. I had in mind a "progressive" who avoids any hot button issues. Someone with a high profile will be asked about politics from time to time. I think Brian Chesky is a good example of acting like a progressive in a way that stays low profile, but maybe he doesn't really act like one. https://www.businessinsider.com/brian-chesky-airbnb-new-bree...

                      Also it helps to have sincere political views. GitHub's CEO at the time of #DropICE was too cynical and his image suffered because of it.

                      • mrandish 13 hours ago

                        > study what a neutral position looks like

                        There are no neutral positions in today's political landscape. I'm not stating my opinion here; this is according to most political positions on the spectrum. You suggested "Progressive" (but without hot-button issues) as a way of signaling a neutral position. That may be true in parts of the valley tech sphere, but it certainly doesn't hold in the rest of the U.S. "Progressive" is usually defined as being to the left of "Liberal", so it's hardly neutral. Over half of U.S. voters cast their ballot for the Republican candidate. Almost all those people interpret anyone identifying themselves as "Liberal" as definitely partisan (and negative, of course). Most of them see "Progressive" as even worse, slipping dangerously toward "Socialist". And the same holds true for the term "Conservative" on the other side of the spectrum, of course.

                        No, identifying as "Progressive" wouldn't distance you from political connotations and culture warring; it's leaping into the maelstrom yelling "Yippee-ki-yay!" You may want to update your priors regarding how the broad populace perceives political labels. With voters divided almost exactly in half regarding politics and culture-war issues, and a large percentage on both sides having "Strong" or "Very Strong" feelings, stating any position will be seen as strongly negative by tens of millions of people. If you're a CEO (or actor, athlete, singer, etc.) who relies on appealing to a broad audience, when it comes to publicly discussing politics (or religion), the downsides can be large and long-lasting but the upsides are small and fleeting. As was said in the movie "WarGames", the only winning move is not playing.

      • belter 19 hours ago

        "Donald Trump represents an unprecedented threat to America, and voting for Hillary is the best way to defend our country against it"

                            - Sam Altman - 2016
        
        "If you elect a reality TV star as President, you can't be surprised when you get a reality TV show"

                            - Sam Altman - 2017
        
        "When the future of the republic is at risk, the duty to the country and our values transcends the duty to your particular company and your stock price."

                             - Sam Altman - 2017
        
        "I think I started that a little bit earlier than other people, but at this point I am in really good company"

                              - Sam Altman - 2017 ( On his criticism of Trump )
        
        "Very few people realize just how much @reidhoffman did and spent to stop Trump from getting re-elected -- it seems reasonably likely to me that Trump would still be in office without his efforts. Thank you, Reid!"

                              - Sam Altman - 2020
      • hibikir 19 hours ago

        As a society we might talk about virtue, but the reason we put it as a goal in stories is that in the real world, we don't reward it. It's not just that corruption wins sometimes, but we directly punish those that fight it. The mood of the times, if anything, comes from people realizing that what we called moral behavior leads to worse outcomes for the virtuous.

        A community only espouses good values when it punishes bad behavior. How do we do this when those misbehaving are very rich, and attempting to punish the misbehavior has negative consequences on you? There just aren't many available tools that don't require significant sacrifices.

        • js8 19 hours ago

          > A community only espouses good values when it punishes bad behavior.

          This is the "beauty" of the free market ideology (see e.g. https://a16z.com/the-techno-optimist-manifesto/ ). If all the transactions are voluntary, there is no way to punish anyone.

          • stevenAthompson 18 hours ago

            > If all the transactions are voluntary, there is no way to punish anyone.

            This is obviously untrue at face value. See: Cancel Culture, Bud Light, and Freedom Fries for examples.

            Did you mean something more than what you stated here?

      • scotty79 19 hours ago

        I especially like how he quoted Napoleon or something, framing himself as the heart of the revolution and DeepSeek as a child of the revolution, only to get a response from some random guy: "It's not that deep bro. Just release a better model."

        https://x.com/hibakod/status/1883189126553596234

        • bloomingkales 19 hours ago

          [flagged]

          • api 19 hours ago

            Elon actually did engineering and had some good ideas. I'm talking pre-social-media-brain-rot Elon.

            • IntelMiner 19 hours ago

              Elon was fired from PayPal partially because he wanted to replace "old ugly mainframes" (Linux and UNIX machines) with the "cutting edge" Windows NT

              • api 19 hours ago

                This only looks stupid in hindsight.

                I worked on something back then that had to interface with payment networks. All the payment networks had software for Windows to accomplish this that you could run under NT, while under Linux you had to implement your own connector -- which usually involved interacting with hideous old COBOL systems and/or XML and other abominations. In many cases you had to use dialup lines to talk to the banks. Again, software was available for Windows NT but not Linux.

                Our solution was to run stuff to talk to banks on NT systems and everything else on Linux. Yes, those NT machines had banks of modems.

                In the late 90s, using NT for something that talks to banks was not necessarily a terrible idea, seen through the lens of the time. Linux was also far less mature back then, and we did not have today's embarrassment of riches when it comes to Linux management, clustering, and orchestration software.

                • oooyay 18 hours ago

                  > This only looks stupid in hindsight.

                  If you're a tech leader and confuse Linux boxes for mainframes then I don't think it's hindsight that makes you look foolish. It's that you do not, in fact, understand what you're talking about or how to talk about it - which is your job as a tech leader.

                • IntelMiner 19 hours ago

                  Under Linux, perhaps. But a web company running on NT instead of Solaris in the 90s? I mean, you could, but you'd be hobbled pretty hard.

                  Especially around the era Musk is quoted for (NT4 in the late 90s), I think most people would have been understandably critical, even at the time.

                • Hasu 18 hours ago

                  > This only looks stupid in hindsight.

                  It looked stupid enough at the time to get him fired for it.

            • belter 18 hours ago

              Did he ever manage to run that Python script?

            • jcgrillo 18 hours ago

              Didn't he lie about having a physics degree?

            • indoordin0saur 19 hours ago

              Yeah Elon has gotten annoying (my god has he been insufferable lately) but his companies have done genuine good for the human race. It's really hard for me to think of any of the other recently made billionaires who have gotten rich off of something other than addicting devices, time-wasting social media and financial schemes.

      • hn_throwaway_99 19 hours ago

        That is particularly gross, but that really feels like the norm among all the tech elite these days - Zuckerberg, Bezos, etc. all doing the most laughable flip flops.

        The reason the flip flops are so laughable to me is because they attempt to couch them in some noble, moralistic viewpoint, instead of the obvious reason "We own big companies, the government has extreme power to make or break these companies, and everyone knows kissing up to Trump is what is required to be on his good side."

        Profiles in Cowardice, every last one of them.

        • jcgrillo 19 hours ago

          Another point of view is that they never flopped or flipped. They were fascists the whole time and were just lying about it before.

          • hn_throwaway_99 18 hours ago

            I think Tim Sweeney's (CEO of Epic Games) comment was spot on:

            > After years of pretending to be Democrats, Big Tech leaders are now pretending to be Republicans, in hopes of currying favor with the new administration. Beware of the scummy monopoly campaign to vilify competition law as they rip off consumers and crush competitors.

            This is exactly what OpenAI is trying to do with these allegations.

        • stevenAthompson 17 hours ago

          Those men and their companies are responsible for hundreds of thousands of jobs and a significant portion of the global economy. I'm actually thankful that they aren't shooting their mouths off to the new boss like spoiled children at their first job. It wouldn't make the world better, it would make their companies and the lives of those who depend on them, worse.

          There is a fine line between cowardice and common sense.

          • jcgrillo 15 hours ago

            In what sense is the federal government "the boss" of private sector businesses? This isn't an oligarchy yet, right? They don't have to behave obsequiously, they are choosing to. They're doing it for themselves, not for their shareholders or their employees. It's an attempt to grab power and become oligarchs because they see in this government a gullible mark.

            • stevenAthompson 13 hours ago

              > This isn't an oligarchy yet, right?

              The richest man in the world has a government office down the street from the white house, which the taxpayers are funding. He's rumored to sleep there.

              What do you think?

              • jcgrillo 11 hours ago

                I think we're close, and they're trying damn hard. We'll see what happens.

          • hn_throwaway_99 15 hours ago

            Puhleeeese. I'm not advocating that these leaders all lead protest marches against the new administration. But the transparent obsequiousness and Trump ball gargling under the guise of some moralistic principles is so nauseating. And please spare me the idea that the likes of Zuckerberg or Bezos gives a rat's ass about their employees.

            For a contrast to the Bezos, Zuckerberg and Altman types, look at Tim Cook. Sure, Apple paid the $1 million inauguration "donation", and Cook was at the inauguration, and I'm not arguing he's winning any "Profiles in Courage" awards, but he didn't come out with lots of tweets claiming how massuh Trump is so wise and awesome, Apple didn't do a 180 on their previous policies, etc.

      • meowface 19 hours ago

        Although I dislike him now glazing Trump, I understand why he's doing it. Trump runs a racket and this is part of the game.

        One of my most contrarian positions is I still like and support Altman, despite most of the internet now hating him almost as much as they (justifiably) hate Elon. Was a fan of Sam pre-YC presidency and still am now.

        (I also am a big fan of DeepSeek and its CEO.)

        • CoastalCoder 18 hours ago

          In the interest of helping avoid an echo chamber, would you mind giving some of the things you like about current Altman?

          • robotresearcher 17 hours ago

            For me, it’s the technical results. Same as for Musk.

            Tesla accelerated us forward into the electric car age. SpaceX revolutionized launches.

            OpenAI added some real startup oomph to the AI arms race which was dominated by megacorps with entrenched products that they would have disrupted only slowly.

            So these guys are doing useful things, however you feel about their other conduct. Personally I find the gross political flip-flops hard to stomach.

          • benterix 18 hours ago

            I'd love to hear something positive about the current Altman, too. Anything would be good.

        • whatshisface 18 hours ago

          Why would you support someone you said was part of a racket in the sentence before? We're talking about real life, where actions have consequences, not a TV show where we're expected to identify with Tony Soprano.

      • the_optimist 19 hours ago

        You're awfully salty and biased toward Reddit contaminants. Don't presume to speak for a community beyond the people who agree with you a priori. Also, don't steal my internet points, taste upon it, redditors.

      • breakyerself 18 hours ago

        If you didn't sexually assault your sister you're already off to a good start.

    • benreesman 18 hours ago

      The guy is a total fucking psycho and the rest of the board are no gems either.

      Their failure is important at a minimum to the future of the United States if not the world.

    • Dansvidania 18 hours ago

      Indeed. First thing I thought was "call a wahmbulance!".

    • tootie 19 hours ago

      The coup against him is looking more and more like a huge "I told you so" moment.

    • api 19 hours ago

      I'm sure him lining up to kiss Trump's ring for some kind of bailout is not a coincidence.

    • 65 19 hours ago

      Yeah I don't know, Altman is a sociopath who is now trying to get intertwined with local governments (SF) as well as the federal government. He's going to do a lot of weaseling to get what he wants: laws that forcibly make OpenAI a monopoly.

      Society will always have crazy sociopaths destroying things for their own gain, and now it's Altman's turn.

      • codr7 19 hours ago

        [flagged]

    • blackeyeblitzar 19 hours ago

      I don’t care for Sam Altman and his general untrustworthy behavior. But DeepSeek is perhaps more untrustworthy. Models from American companies at least aren’t surprising us with government driven misinformation, and even though safety can also be censorship, the companies that make these models at least openly talk about their safety programs. DeepSeek is implementing a censorship and propaganda program without admitting it at all, and once they become good at doing it in less obvious ways, it can become very damaging and corrupt the political process of other societies, because users will trust the tools they use are neutral.

      I think DeepSeek’s strategy to announce a misleading low cost (just the final training run that optimizes a base model that in turn is possibly based on OpenAI) is also purposeful. After all, High Flyer, the parent company of DeepSeek, is a hedge fund - and I bet they took out big short positions on Nvidia before their recent announcements. The Chinese government, of course, benefits from a misleading number being announced broadly, causing doubt among investors who would otherwise continue to prop up American technology startups. Not to mention the big fall in American markets as a result.

      I do think there's also a big difference between scraping the Internet for training data, which might just be fair use, and training off other LLMs or obtaining their assets in some other way. The latter feels like the kind of copying and industrial espionage that used to get China ridiculed in the 2000s and 2010s. Note that DeepSeek has never detailed their training data, even at a high level. This is true even in their previous papers, where they were very vague about the pre-training process, which feels suspicious.

      • tempusalaria 18 hours ago

        DeepSeek v3 (where the training cost claims come from) was announced a month ago and it had no impact outside of a small circle

      • ryanisnan 18 hours ago

        > Models from American companies at least aren’t surprising us with government driven misinformation, and even though safety can also be censorship

        Being a citizen of a western nation, I'm inclined to agree with the general sentiment here, but how can you definitively say this? Neither you nor I know with any certainty what interference the US government has engaged in with domestic LLMs, or what lies it has fabricated and cultivated that are now part of those LLMs' collective knowledge. We can see the perceived censorship with DeepSeek more clearly, but that isn't evidence that we're in any safer territory.

      • dheera 19 hours ago

        > I bet they took out big short positions on Nvidia before their announcements

        Good for them! I hope this teaches Wall Street to not freak out about an unverified announcement.

        Wall Street lost billions, and I hope they learned their lesson and next time will not crash the market when unverified news comes out.

      • cycomanic 13 hours ago

        > I don’t care for Sam Altman and his general untrustworthy behavior. But DeepSeek is perhaps more untrustworthy. Models from American companies at least aren’t surprising us with government driven misinformation, and even though safety can also be censorship, the companies that make these models at least openly talk about their safety programs. DeepSeek is implementing a censorship and propaganda program without admitting it at all, and once they become good at doing it in less obvious ways, it can become very damaging and corrupt the political process of other societies, because users will trust the tools they use are neutral.

        These arguments always remind me of the arguments against Huawei because they _might_ be spying on western countries. On the other hand we had the US government working hand in hand with US corporations in proven spying operations against western allies for political and economic gain. So why should we choose an American supplier over a Chinese one?

        > I think DeepSeek’s strategy to announce a misleading low cost (just the final training run that optimizes a base model that in turn is possibly based on OpenAI) is also purposeful. After all, High Flyer, the parent company of DeepSeek, is a hedge fund - and I bet they took out big short positions on Nvidia before their recent announcements. The Chinese government, of course, benefits from a misleading number being announced broadly, causing doubt among investors who would otherwise continue to prop up American technology startups. Not to mention the big fall in American markets as a result.

        Why should I care about the stock value of US corporations?

        > I do think there’s also a big difference between scraping the Internet for training data, which might just be fair use, and training off other LLMs or obtaining their assets in some other way.

        So if training on copyrighted work scraped off the Internet is fair use, how would training on the LLMs' outputs not be fair use as well? You can't have it both ways.

      • pphysch 18 hours ago

        > Models from American companies at least aren’t surprising us with government driven misinformation

        There are loads of examples on the internet of LLMs pushing (foreign) government narratives e.g. on Israel-Palestine.

        Just because you might agree with the propaganda doesn't make it any less problematic.

        • blackeyeblitzar 16 hours ago

          > There are loads of examples on the internet of LLMs pushing (foreign) government narratives e.g. on Israel-Palestine

          There isn’t even a single example of that. If an LLM is taking a certain position because it has learned from articles on that topic, that’s different from it being manipulated on purpose to answer differently on that topic. You’re confusing an LLM simply reflecting the complexity out there in the world on some topics (showing up in training data), with government forced censorship and propaganda in DeepSeek.

          The two aren’t the same, not even remotely close.

          • pphysch 15 hours ago

            Fine, whatever. It's actually much more concerning if the overall information landscape has been so curated by censors that a naively-trained LLM comes "pre-censored", as you are asserting. This issue is so "complex" when it comes to one side, and "morally clear" when it comes to the other. Classic doublespeak.

            That's far more dystopian than a post-hoc "guardrailed" model (that you can run locally without guardrails).

      • vohk 18 hours ago

        > Models from American companies at least aren’t surprising us with government driven misinformation

        Is corporate misinformation so much better? Recall about Tiananmen Square might be more honest, but if LLMs had been available over the past 50 years, I would expect many popular models would have cheerfully told us company towns are a great place to live, cigarettes are healthy, industrial pollution has no impact on your health, and anthropogenic climate change isn't real.

        Especially after the recent behaviour of Meta, Twitter, and Amazon in open support of Trump and Republican interests, I'll be shocked if we don't start seeing that reflected in their LLMs over the next few years.

  • cscurmudgeon 20 hours ago

    Scraping data is different from scraping outputs from a model.

    • Rebelgecko 18 hours ago

      Copyright is weird and often legal ≠ moral, but I'm having a hard time constructing a mental model where it's ok to scrape a novel written by a person but it's not ok to scrape a story written by chatgpt

    • jacobgorm 19 hours ago

      No it is not, data is data, whether it gets loaded from a file on disk or generated by multiplying lots of matrices.

    • openrisk 19 hours ago

      Because the data is mine and the model is yours?

    • idle_zealot 19 hours ago

      Like, in a strict literal sense, sure? Do you mean to make a claim about moral or legal differences?

    • wkz 19 hours ago

      Technically, sure. What is the moral distinction though?

  • Imustaskforhelp 18 hours ago

    Yes, the irony is so thick in the air that it could be cut with a Swiss Army knife lol

    I had literally come to this post to say the same. You beat me to it.

    The USA is going crazy over DeepSeek, and to me it just shows that the world is in a black swan moment, an AI bubble.

    I am not saying AI has no use. I regularly use it to create things, but it's just not recommended. I am going to stop using AI, to grow my mind.

    And it's definitely way overpriced. People are investing so much money without seeing the returns, and I think people are also using AI out of a sense of FOMO. I don't know, to me it's funny.

    I really, really want to create an index fund with strictly no AI companies, since the usual ones don't feel diversified enough. Sure, Nvidia gave a quarter's worth of return over the last year, but at this point it almost feels the same as bitcoin. The reason I don't / won't invest in bitcoin is that I don't want "that" risk.

    This has been a boggling year.

    I have realized that the world is crazy. Truly. Trump going from getting shot to winning, DeepSeek dragging down Nvidia and the American stock market (heck, even bitcoin!), Trump launching his meme coin. It's so crazy. If the world is crazy, just be the sane person around. You will stick around; that's my philosophy. I won't jump on the AI bandwagon. But it's still absolutely wild and horrifying to see how a "side project" (DeepSeek) put the American stock market in shambles.

    I want more diversification. I am not satisfied with the current system. This feels like a bubble and I want no part in it.

wendyshu 13 hours ago

If distillation gives you a cheaper model with similar accuracy, why doesn't OpenAI distill its own models?

jeanlucas 19 hours ago

damn that's a good headline

coldpepper 10 hours ago

Fuck OpenAI. They didn't ask my permission to crawl my blog into their dataset.

halyconWays 15 hours ago

Oh no, so sad. The Open non-profit that steals 100% of all copyrighted content and makes multiple billion-dollar for-profit deals while releasing no weights is crying. This is going to ruin my sleep. :(

m3kw9 16 hours ago

So if OpenAI didn't have these outputs for distillation, Deepseek wouldn't exist?

nataliste 16 hours ago

A Wolf had stolen a Lamb and was carrying it off to his lair to eat it. But his plans were very much changed when he met a Lion, who, without making any excuses, took the Lamb away from him.

The Wolf made off to a safe distance, and then said in a much injured tone:

"You have no right to take my property like that!"

The Lion looked back, but as the Wolf was too far away to be taught a lesson without too much inconvenience, he said:

"Your property? Did you buy it, or did the Shepherd make you a gift of it? Pray tell me, how did you get it?"

What is evil won is evil lost.

paulhart 20 hours ago

"You are trying to kidnap what I have rightfully stolen"

sonabinu 20 hours ago

poetic justice (pun intended)

spacecadet 18 hours ago

See you all on lobsters...

So long HN and thanks for all the fish?

sylware 20 hours ago

LOL, I was thinking exactly the same thing when I read the news about OpenAI whining.

deeviant 18 hours ago

Hmm, let’s see—it looks like an easy legal defense.

DeepSeek could simply admit, "Yep, oops, we did it," but argue that they only used the data to train Model X. So, if you want compensation, you can have all the revenue from Model X (which, conveniently, amounts to nothing).

Sure, they then used Model X to train Model Y, but would you really argue that the original copyright holders are entitled to all financial benefits derived from their work—especially when that benefit comes in the form of a model trained on their data without permission?

curtisblaine 19 hours ago

My feeling is that they will ban DS anyway because, like TikTok, it can become a massive intelligence source for the CCP. Imagine sending all your code to it, or your internal emails.

65 20 hours ago

Let me guess, this gives the government an excuse to ban DeepSeek. Which means tech companies get to keep their monopolies, Sam Altman can grab more power, and the tech overlords can continue to loot and plunder their customers and the internet as a whole.

deadbabe 19 hours ago

In this whole AI saga, DeepSeek would be like Prometheus. They stole the fire from the Gods and gave it to the humans, for free. Logic dictates then that they will be forced to suffer brutal punishment.

gostsamo 20 hours ago

How dare you take what I've rightfully stolen!

vcryan 9 hours ago

I love watching billionaires squirm

nottorp 20 hours ago

IP thief cries IP thief.

It's okay when you steal worldwide IP to train your "AI".

It's not okay when said stolen IP is stolen from you?

If the Chinese are guilty, then Altman's doom-and-gloom racket is as guilty or even more so, considering they stole from everyone.

insane_dreamer 19 hours ago

Usually I'm very much on the side of protecting America's interests from China, but in this case I'm so disgusted with OpenAI and the rest of BigTech driving this "arms race" that I'd be happy with them burning to the ground.

So we're going to reverse our goals of reducing emissions and fossil fuels, which were meant to spare future generations the worst effects of climate change, in the name of doing what, exactly, that actually benefits humanity? Boosting corporate profits by reducing labor?

  • insane_dreamer 16 hours ago

    downvoted -- I guess I upset some people defending OpenAI? Good.

mk89 19 hours ago

What a joke OpenAI has become.

asah 20 hours ago

Thieves' honor, huh?

thumbsup-_- 18 hours ago

Is stealing from the thief actually theft?

coolhand2120 19 hours ago

I was using the Deepseek model last night and got this:

> You > do you understand copredication?

> Assistant > deepseek-coder-33b-instruct > I'm sorry for any misunderstanding, but as an AI model developed by OpenAI, I don't have personal ...

Yikes! That's pretty damning IMO.

pointedAt 19 hours ago

it's a corona style portfolio gang.

boxedemp 14 hours ago

DeepSeek refers to itself as ChatGPT sometimes lol

seydor 20 hours ago

But now OpenAI will use DeepSeek to reuse even more stolen data to train new models that they can serve without ever giving us the code, the weights, or even the thinking process, and they will still be superior

B1FF_PSUVM 16 hours ago

"Cry me a river" is a phrase I haven't heard recently, for some reason ...

typon 20 hours ago

OpenAI is the MIC darling - expect more ridiculous attacks on competitors in the future

the_optimist 17 hours ago

This whole topic is garbage on fire. The same pack of maroons who careened around society for years clamoring for censorship now imagine that Aaron Swartz is their hero and that they want to menace people. Kids, don't be like the grasping fools in these threads: philosophically unfounded, desperately glancing sideways, hoping the cumulative feels and gossip will sum to a life's meaning.

htrp 15 hours ago

In other news.....water is wet

1propionyl 15 hours ago

At this point, the only thing that keeps me using ChatGPT is o1 w/ RAG. The usage limits on o1 are prohibitively tight for regular use, so I have to budget usage to tasks that would benefit there. I also have significant misgivings about their policies around output, which also limit what I can use it for.

For local tasks, the deepseek-r1:14b and deepseek-r1:32b distillations immediately replace most of that usage (prior local models were okay, but not consistently good enough). Once there's a "just works" setup for RAG on par with installing ollama (which I doubt is far off), I don't see much reason to continue paying for my subscription.
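
For anyone curious, the local side of that swap really is that simple. A minimal sketch, assuming the `ollama` Python package and that the model was pulled first with `ollama pull deepseek-r1:14b` (the prompt is just an example):

  # Query a local DeepSeek-R1 distillation served by ollama.
  # Assumes `pip install ollama` and a prior `ollama pull deepseek-r1:14b`.
  import ollama

  response = ollama.chat(
      model="deepseek-r1:14b",
      messages=[{"role": "user", "content": "Explain RAG in two sentences."}],
  )
  print(response["message"]["content"])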

Sadly, like many others in this thread, I expect under the current administration to see self-hamstringing protectionism further degrade the US's likelihood of remaining a global powerhouse in this space. Betting the farm on the biggest first-mover who can't even keep up with competition, has weak to non-existent network effects (I can choose a different model or service with a dropdown, they're more or less fungible), has no technological moat and spent over a year pushing apocalyptic scenarios to drum up support for a regulatory moat...

...well it just doesn't seem like a great idea to me.

almostdeadguy 13 hours ago

Hope Sam Altman is getting his money's worth out of that Trump campaign contribution. Glorious days to be living under the reign of a new Boris Yeltsin, pawning and strip-mining the federal apparatus to the most loyal friends and highest bidders.

xyst 11 hours ago

What a load of shit. ClosedAI is publishing a hit piece on DeepSeek to get the public and politicians on their side. Maybe even get the government to do their dirty work.

If they had a case, they wouldn't be using the FT. They would be filing a court case. Although that would open them up to discovery, and the nasty shit ClosedAI has been up to would be fair game.

dbg31415 12 hours ago

Boo hoo?

Back in college, a kid in my dorm had a huge MP3 collection. And he shared it out over the network, and people were all like, "Man, Patrick has an amazing MP3 collection!" And he spent hours and hours ripping CDs from everyone so all the music was available on our network.

Then I remember another kid coming in, with a bigger hard drive, and he just copied all of Patrick's MP3 collection and added a few more to it. Then ran the whole thing through iTunes to clean up names and add album covers. It was so cool!

And I remember Patrick complained, "He stole my MP3 collection!"

Anyway, this story sums up how I feel about Sam Altman here. He's not Metallica, he's Patrick.

https://www.npr.org/2023/12/27/1221821750/new-york-times-sue...

pcthrowaway 20 hours ago

Now that China is talking about lifting the Great Firewall, it seems like the U.S. is on track to cordon themselves off from other countries. Trump's talk of building a wall might not stop at Mexico.

FpUser 13 hours ago

Pot calling kettle black?

josefritzishere 17 hours ago

OpenAI, who committed copyright infringement on a massive scale, wants to defend against a superior product on the basis of infringement? What nonsense.

delusional 18 hours ago

Boo hoo. Competition isn't fun when I'm not winning. Typical Americans. When Americans are running around ruining the social cohesion of several developing nations, that's just fair competition, but as soon as they get even the smallest hint of real competition they run to demonize it.

Yes, DeepSeek is going to steal all of your data. OpenAI would do the same. Yes, the CCP is going to get access to your data and use it to decide if you get to visit or whatever. The White House does the same.

hsuduebc2 18 hours ago

The pot calling the kettle black

moralestapia 16 hours ago

Called it from day 0: impossible to reach that performance with $5M; they had to distill OpenAI (or some other leading foundation model).

Got downvoted to oblivion by people who haven't been told what to think by MSM yet. Now it's on the FT and everywhere. Good; what matters is that the truth comes out eventually.

I don't take any sides and think what DeepSeek did is fair play. However, what I do find harmful about this is: what incentive would company A have to spend billions training a new frontier model if all of it could then be reproduced by company B at a fraction of the cost?

  • kgeist 15 hours ago

    The "evidence" is very weak though:

    >The San Francisco-based ChatGPT maker told the Financial Times it had seen some evidence of “distillation”, which it suspects to be from DeepSeek.

    Given that many people have been using ChatGPT to distill their fine-tunes for a few years now, how can they be sure it was specifically DeepSeek? There's, say, glaive.ai whose entire business model is to sell you synthetic datasets, probably generated with ChatGPT as well.
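
    To illustrate how low that bar is: harvesting teacher outputs for distillation-style fine-tuning takes only a few lines. A minimal sketch, assuming the official `openai` Python client (the teacher model name and prompts are illustrative, not anyone's actual pipeline):

      # Collect teacher completions and write them out as a supervised
      # fine-tuning dataset, one JSON example per line. Assumes the
      # official `openai` client; model name and prompts are illustrative.
      import json
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment

      prompts = [
          "Prove that the square root of 2 is irrational.",
          "Write a function that merges two sorted lists.",
      ]

      with open("distill.jsonl", "w") as f:
          for p in prompts:
              resp = client.chat.completions.create(
                  model="gpt-4o",
                  messages=[{"role": "user", "content": p}],
              )
              pair = {"prompt": p, "completion": resp.choices[0].message.content}
              f.write(json.dumps(pair) + "\n")

    Anyone with an API key can run something like this at scale, which is exactly why pinning the practice on DeepSeek specifically is hard.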

    • suraci 5 hours ago

      Personally, I find DeepSeek is very, very good at Chinese. I mean, it's highly literary and eloquent; it's quite amazing.

      I didn't see this in o1 or any other LLM. Can distillation give DeepSeek such a capability?

    • moralestapia 15 hours ago

      I agree that the evidence is weak, and even if they had some, they cannot really do anything.

      To me, it's just very likely they distilled GPT-4, because:

      1) Again, you just cannot get that performance at that cost. And no, what they describe in the paper is not enough to explain the 1,000-fold decrease in cost.

      2) Very often, DeepSeek tells you it's ChatGPT or OpenAI; it's actually quite easy to get it to do that. Some say that's related to "the background radiation on the post-AI internet". I'm not a fentanyl consumer so, unfortunately, I think that argument is trash.

      • kgeist 15 hours ago

        If it's just a distillation of GPT-4, wouldn't we expect it to have worse quality than o1? But I've seen countless examples of DeepSeek-r1 solving math problems that o1 cannot.

        >Very often, DeepSeek tells you it's ChatGPT or OpenAI; it's actually quite easy to get it to do that. Some say that's related to "the background radiation on the post-AI internet". I'm not a fentanyl consumer so, unfortunately, I think that argument is trash.

        The exact same thing happened with Llama. Sometimes it also claimed to be Google Assistant or Amazon Alexa.

        • esafak 13 hours ago

          No, because it is not a distillation, but an extension. A selling point of the model is using RL to push past the quality of the base model.

        • moralestapia 15 hours ago

          >wouldn't we expect it to have worse quality than o1?

          That's tricky; you can optimize a model to do really well on synthetic benchmarks.

          That said, DeepSeek performs a bit worse than GPT-4 in general and does substantially worse on benchmarks like ARC, which are designed with this in mind.

          • kgeist 14 hours ago

            Are you sure you checked R1 and not V3? By default, R1 is disabled in their UI.

              Prompt: Find an English word that contains 4 'S' letters and 3 'T' letters.
            
              Deepseek-R1: stethoscopists (correct, thought for 207 seconds)
            
              ChatGPT-o1: substantialists (correct, thought for 188 seconds)
            
              ChatGPT-4o: statistics (wrong) (even with "let's think step by step")
            
            In almost every example I provide, it's on par with o1 and better than 4o.

            >substantially wrong on benchmarks like ARC which is designed with this in mind.

            Wasn't it revealed OpenAI trained their model on that benchmark specifically? And had access to the entire dataset?

            • moralestapia 13 hours ago

              That prompt means nothing. Check out the benchmarks.

              Also, compare V3 to 4o and R1 to o1, that's the right way.

52-6F-62 17 hours ago

I heard they were just "democratizing" LLM and AI development.

Yesterday the industry crushed pianos and tools and bicycles and guitars and violins and paint supplies and replaced them with a tablet computer.

Tomorrow we can replace craven venture capitalists and overfed corporate bodies with incestuous LLMs and call it a day.

this15testingg 19 hours ago

if you want to completely disregard copyright laws, just call your project AI!

I'm sure Aaron Swartz would be proud of where the "tech" industry has gone. /s

what problem are these glorified AIM chatbots trying to solve? wealth extraction not happening fast enough?

mring33621 20 hours ago

We demand immediate government action to prevent these cheaper foreign AIs from taking jobs away from our great American AIs!

  • bhargav 18 hours ago

    This is gonna be spun up as a security thing, and banned cozz Murica.

  • bhouston 20 hours ago

    > We demand immediate government action to prevent these cheaper foreign AIs from taking jobs away from our great American AIs!

    That is exactly what Microsoft and Sam Altman are asking for. And they will likely get it, because Trump really likes protectionist government policies.

    • blantonl 20 hours ago

      It’s funny, the Chinese are here innovating on AI, batteries, and fusion, and here in the United States we’ve pivoted to shitcoins and universal tariffs.

      At least we have the CyberTruck to highlight American greatness

      • mk89 19 hours ago

        They created an untameable beast (China) thanks to the "cheap factories" there, and they thought it would stay that way.

        What a bunch of idiots. The propaganda keeps telling us that they don't invent, they can only copy, etc., but clearly that's not true.

    • clarionbell 20 hours ago

      He likes feeling important, just look at TikTok. All it took was bit of sycophancy and he turned into Mr. Freemarket again.

      Really, people need to realize that Trump has never been consistent in any of his political positions, except for one: "You have to look out for number one."

      • bhouston 20 hours ago

        clarionbell wrote:

        > He likes feeling important, just look at TikTok. All it took was bit of sycophancy and he turned into Mr. Freemarket again.

        Not really. He said that TikTok has to shift towards US ownership if it wants to continue; he just gave them a 90-day extension to allow that change in ownership.

        • meiraleal 19 hours ago

          Which TikTok will have to decline again and shut down, now with the blame being transferred to Trump. Doesn't sound like a smart move.

  • Dansvidania 17 hours ago

    Does it matter if the company gets banned? Other non-Chinese companies can pick up the open source model and run it as a service with relatively low investment; isn't that the point?

  • cactusplant7374 18 hours ago

    To the detriment of OpenAI, the math is going to be used to improve AIs developed in America. And we need to remember that Marc Andreessen is very much against governments banning math.

worik 17 hours ago

[flagged]

  • Austiiiiii 16 hours ago

    And we'd be shooting ourselves in the foot to do so. If America is forced to use only the clunky corporate-owned American AI at a fee, we'll very quickly fall behind competitors worldwide who use DeepSeek models to produce better results for much, much cheaper.

    Not to mention it'd defeat the whole purpose of a "free market" economy. (Not that that means much of anything anymore)

    • kelnos 15 hours ago

      It never meant anything. There's no such thing as a free market economy. We haven't had one of those in modern times, and arguably human civilization has never had one. Markets and their participants have chronically been subject to information asymmetry, coercion/manipulation, and regulation, among other things.

      I don't think all of that is a bad thing (regulation tends to make it harder to do the first two things), but "free markets" are the economic equivalent to the "point mass" in physics: perhaps useful sometimes to create simple models and explanations of things, but will never exist in the real world.

      • worik 12 hours ago

        Yes, technically and pedantically you are correct.

        But restricting the trade in microchips only because the USA is afraid it will lose a technical and commercial edge is a long, long way from a free market.

        It is too late, too. China has broken out and they are ahead in many fields. Not trading chips with them will make them build their own foundries. In two decades they will be as far ahead there as they are in many other fields.

        If the USA would trade, then the technological capacities of China and the USA would stay matched, as they would help each other: China ahead in some areas, the USA ahead in others.

        That would still (probably) not be a pure free market, but it would be a freer market, and better for everybody except a few elites (on both sides).

        Madness is taking root

    • iforgot22 15 hours ago

      The Nvidia export restrictions might also be shooting us in the foot, or at least Nvidia. They really benefit from CUDA remaining the de facto standard.

      • segasaturn 14 hours ago

        The Nvidia export restrictions have already harmed Nvidia. Deepseek-R1 is efficient on compute and was trained on old, obsolete cards because they couldn't get ahold of Nvidia's most cutting edge tech, so they were forced to innovate instead of just brute-forcing performance with faster cards. That has directly resulted in Nvidia's stock crashing over the last week.

        • iforgot22 14 hours ago

          That's true, but I mean it'd be a hundred times worse if Deepseek did this on non-Nvidia cards. Which seems like only a matter of time if we're going to keep doing this.

          • talldayo 8 hours ago

            > Which seems like only a matter of time if we're going to keep doing this.

            What's funny is that people have been saying this since OpenCL was announced, but today we're actually in a worse spot than we were 10 years ago. China too: their inability to replicate EUV advancements has left their lithography in a terrible place.

            There's a reason China is literally dependent on Nvidia for competitive hardware. It's their window into export-controlled TSMC wafers and complex streaming multiprocessor designs. Losing access to Nvidia hardware isn't an option for them (or the United States for that matter) which is why the US pushes so hard for export controls. There is no alternative.

            • iforgot22 8 hours ago

              Well I wasn't optimistic about OpenCL in the past, because nobody will bother with that when they can pay a little (or even a lot) more for Nvidia and use CUDA. Even though OpenCL might work in theory with whatever tools someone is using, it's unpopular and therefore less supported. But this time is different.

              • talldayo 8 hours ago

                > But this time is different.

                Is it? Time will tell, but it wasn't "different" even during the crypto craze, when CUDA was literally printing money. We were promised world-changing ASICs, just like with Cerebras and Groq, and ended up with nothing in the end.

                When AI's popularity blows over (and it will, like crypto), will the market have responded fast enough? From where I'm standing, it looks like every major manufacturer (and even the Chinese market) is trying the ASIC route again. And having watched ASICs die a very painful and unsatisfying death in the mining pools, I'd like to avoid manufacturing purpose-made ICs that are obsolete within months. I'm simply not seeing the sort of large-scale strategy that threatens Nvidia's actual demand.

    • onlyrealcuzzo 14 hours ago

      DeepSeek's technology is out of the bag.

      Every LLM provider in the US will be using it to lower OpEx.

      One of them is likely to pass those savings along to consumers to gain market share.

      Facebook is in the business of providing weights for free.

      The idea that we are all doomed unless we immediately migrate to DeepSeek is fantasy.

  • sangnoir 14 hours ago

    > Banned in the USA. Only.

    The US government has the wherewithal to drag Europe along with it, like they did with Huawei's 5G equipment.

    • roblabla 14 hours ago

      So far, a general TikTok ban (as opposed to a TikTok ban on things like government phones) has only been in effect in the USA. I highly doubt Europe would play ball with any attempt at banning imports of DeepSeek.

      Besides, it's kinda too late for this. The model is freely accessible, so any attempt at banning it would be _completely_ moot. If DeepSeek keeps releasing their future models for free, I don't see how a ban could ever be effective at all. Worst case scenario, big tech can't use those models... but then individuals (and startups willing to go fast and break laws) will be able to use them and instantly get a leg up on the competition.

    • HotHotLava 13 hours ago

      Used to. If they're going to start a trade war, pull out of NATO, and invade Greenland instead, there won't be much soft power left to drag Europe anywhere.

    • crote 6 hours ago

      Perhaps a few years ago, yes. But considering that the US president is now threatening to invade and annex part of a European country, I don't think past results can be extrapolated to the present.

    • realusername 8 hours ago

      Not really anymore, and I would bet that the way things are going, Twitter will be banned in the EU before Tiktok.

    • buyucu 13 hours ago

      And yet, Huawei is still doing fine.

  • buyucu 13 hours ago

    I have a Xiaomi phone, a Huawei Matebook and a BYD car. I guess I am pivoting to China after all :)

  • greenchair 11 hours ago

    The world is pretty small, as we see with all nations kissing the ring in the last few weeks. The pivot away from China will accelerate.

  • dismalaf 16 hours ago

    [flagged]

    • dang 5 hours ago

      Could you please stop posting unsubstantive comments and flamebait? You've unfortunately been doing it repeatedly. It's not what this site is for, and destroys what it is for.

      If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.

    • segasaturn 14 hours ago

      BYD cars are everywhere in Latin America and Europe. Xiaomi phones also.

      • faizmokh 12 hours ago

        A much better build compared to Tesla.

        • worik 11 hours ago

          > A much better build compared to Tesla.

          Really? I am intrigued. Why do you think that?

          • crote 6 hours ago

            The Cybertruck is well-known for breaking down for the silliest reasons and losing parts during regular driving.

            I have no idea what the build quality of BYD is, but doing better than Tesla isn't exactly a challenge.

      • frosting1337 12 hours ago

        Same in Australia. Xiaomi phones less so.

      • dismalaf 11 hours ago

        I live in Europe half the time. Literally never seen a BYD car in the flesh.

        Chinese phones, yes. But I'd argue we're past peak China. Huawei phones were briefly the #1 sellers in the world, but have since pulled back.

    • alecco 16 hours ago

      What about Brazil?

      • Vox_Leone 15 hours ago

        Brazil's technological environment is stagnant and apathetic. And it has been that way since the tragic explosion that claimed the country's space program. It seems that all of the country's technological activities have suffered the blow. It is possible that now it will finally be able to manufacture its national tier 3 AI, using the outputs of DeepSeek.

      • dismalaf 16 hours ago

        [flagged]

        • Spivak 16 hours ago

          [flagged]

          • dismalaf 15 hours ago

            What's happening at home? From a foreigner's (Canadian) perspective, Trump's not doing anything crazy but the media is going crazy...

            Our Prime Minister has multiple scandals worse than anything Trump has done so far... from groping a reporter, to multiple instances of blackface, to firing the attorney general because she investigated a company that gave bribes to Moammar Gaddafi, to nearly a billion dollars going to a 2-person software consulting firm, to fake charities giving him and his family members millions of dollars, and much more. Yet somehow we're held up as an example of democracy, and the US, which defends democracy all over the world, somehow isn't...

            • kelnos 15 hours ago

              Not sure what you mean; the things you list that your PM has done seem like just another day at the office for Trump.

              Trump has already done quite a few crazy things since his inauguration. And I'm judging that by what I'm hearing from people who are directly affected by his actions, not by what I'm hearing from the media.

              • germandiago 7 hours ago

                > the things you list that your PM has done seems like just another day at the office for Trump

                Yet you did not mention one single thing. The thread to which you replied did.

            • nateglims 14 hours ago

              Trump is doing things that violate some of the guardrails of the US system. For example, refusing to disburse funds allocated by the legislature. These separations are less of a barrier in parliaments, at least in the Westminster system. These boundaries have been redrawn before: the federal bureaucracy was basically invented between the 30s and the 60s, and SCOTUS's role as we know it was established in the early 1800s.

            • cdelsolar 14 hours ago

              It's not crazy to suspend all federal grants, pull us out of the Paris accord when the world is rapidly heating up, try to put RFK Jr., a vaccine denier, in charge of our health, rename the goddamn Gulf of Mexico, unconstitutionally order that people born on American soil are not necessarily Americans anymore (which has been the case for at least 150 years), withdraw from the WHO, and launch hourly raids to deport people, many of whom are actually American citizens?

              • germandiago 7 hours ago

                > when the world is rapidly heating up

                I think it would do no harm to know that the Earth has had at least one glacial era, had 10 times more carbon dioxide than now at some point, and that the climate has always changed throughout history, all without humans being involved. Do not take as fact the propaganda that "it is humans and only humans who are making the temperature change".

                If I am not wrong, carbon dioxide accounts for only a small percentage of the greenhouse effect (I do not remember if it was 3%, of which only a fraction is generated by humans). Water vapor (which is also a greenhouse gas) accounts for something like 90% or more, through the effect of the sun on water, and no one talks about this in the media. Also, the most catastrophic models are usually chosen over alternative models.

                There are also multiple reports of how data was deliberately hidden or manipulated: https://www.libertaddigital.com/ciencia/que-es-el-watergate-... (in Spanish, some evidence of manipulation about climate change, which was conveniently renamed from "global warming" at some point, because in the last 20 years there has been no such "global warming"). I think there was a record in the mass of ice registered in Antarctica in 2017, if I am not wrong...

                This whole story smells really bad at this point, to be honest.

              • dismalaf 13 hours ago

                [flagged]

                • rishav_sharan 7 hours ago

                  > Is anyone on pace to meet those targets? Canada isn't.

                  India, one of the poorest countries by per capita GDP, met its target before the deadline. Rather than looking at others, the US needs to ensure that it meets its own targets, because it's the sole superpower, and at an international level, without the US leading on sustainability there is no hope for the rest of us.

                • worik 12 hours ago

                  Be a bit careful of "whataboutism". It is a cognitive fallacy.

                  Canada's civil society collapsing does not mean that the USA's civil society is not. Perhaps they both are?

                  I think that the social fabric in both countries can take the punishment for a decade or two. It will be some time before they collapse.

                  • dismalaf 11 hours ago

                    [flagged]

                    • ulbu 5 hours ago

                      [flagged]

                      • dang 5 hours ago

                        Could you please stop posting unsubstantive comments and flamebait? You've unfortunately been doing it repeatedly. It's not what this site is for, and destroys what it is for.

                        If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.

                        • ulbu 4 minutes ago

                            Unsubstantive, sure, but I don't see how expressing a sentiment of disbelief and suspicion of trolling (or however it may be called) this way is "flamebait". But sure, whatever you say.

    • AdeptusAquinas 16 hours ago

      [flagged]

      • dismalaf 16 hours ago

        [flagged]

        • FooBarWidget 16 hours ago

          [flagged]

          • dang 5 hours ago

            Could you please stop posting unsubstantive comments and flamebait? You've unfortunately been doing it repeatedly. It's not what this site is for, and destroys what it is for.

            If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.

          • lukev 13 hours ago

            [flagged]

        • Tostino 15 hours ago

          [flagged]

          • dang 5 hours ago

            Could you please stop posting unsubstantive comments and flamebait? You've unfortunately been doing it repeatedly. It's not what this site is for, and destroys what it is for.

            If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.

    • intalentive 15 hours ago

      [flagged]

      • AdeptusAquinas 15 hours ago

        Agree with most of this (not the weird anti-immigrant bit, but the rest). However, China does have some foreign requirements that are a bit of a pain in the ass, like its insistence that Taiwan isn't a country. They also don't like it and will retaliate when you point out the shady shit it does (e.g. the Uighurs), but then that's no different from the States, especially under its current toddler administration.

        • intalentive 15 hours ago

          The immigration bit helps explain why Hungary, for instance, is trying to escape Western orbit and align with the East.

          • intended 14 hours ago

            Is Hungary planning to exit the EU?

            • intalentive 14 hours ago

              I can't say what Orban is planning specifically, but his recent actions have not been well received by the EU Council, to say the least. Moreover Hungary has been China's main outpost in Europe since 2015 when it joined Belt and Road.

              Considering fresh signs of rupture in transatlantic relations, maybe Orban will turn out to have had keen foresight. There seems to be some sort of realignment afoot under the Trump administration.

              https://en.wikipedia.org/wiki/2024_visits_by_Viktor_Orb%C3%A...

              https://en.wikipedia.org/wiki/China%E2%80%93Hungary_relation...

              • portaouflop 11 hours ago

                Orban is a racist, kleptocratic madman who runs a mafia state. Trying to apply reason to his actions beyond enriching himself and his lackeys is a fool's game.

        • dismalaf 11 hours ago

          > its insistence that Taiwan isn't a country

          Lol Taiwan is officially the "Republic of China", as per their own claims.

        • skinnymuch 14 hours ago

          Every country would behave like China in the same situation with Taiwan. Imagine if the Confederates moved over to Puerto Rico or Hawaii or Alaska. America damn sure would say that’s America still. They’re literally the same people from the same land. Same ethnicity. Same history. Only being apart for under a century.

          • worik 11 hours ago

            > They’re literally the same people from the same land. Same ethnicity. Same history. Only being apart for under a century

            I see what you mean. But there is an alternative point of view.

            The indigenous people of Taiwan are very different. DNA-wise, they are the prototypical Polynesians.

            • skinnymuch 5 hours ago

              From the alternative pov of anyone not from the West / the colonizer countries:

              We aren't talking about those indigenous people regarding this topic. We are talking about the Chinese people there. This would be clear and obvious if Confederates were in any of the examples I gave. All 3 examples have indigenous people who aren't cared about now.

              Americans were still actively cleansing Native Americans under 200 years ago. The only country that would do anything serious about an attempt at Chinese reunification would be America [and of course NATO and Europe but if America wasn’t doing anything, Europe wouldn’t either].

              If it wasn’t for America, reunification would have already happened.

              So the pov of Americans or the west caring about indigenous people is faulty from the pov of most of the rest of the world. The west should care about indigenous people in their own direct spheres of influence first.

me_me_me 14 hours ago

[flagged]

  • bashtoni 11 hours ago

    Claude will also often refer to itself as ChatGPT. This is not a 'China' problem.

    • me_me_me 11 hours ago

      Jesus Christ people!

      If you stole my wallet because someone else stole your wallet, it DOES NOT MAKE IT OK.

      Stealing is stealing. Stop excusing it.

      • throwr38274 7 hours ago

        Your analogy is not correct, since you were the one who stole the wallet in the first place.

        The correct analogy would be, someone took the wallet you stole and gave it back to the person.

  • dns_snek 14 hours ago

    [flagged]

    • brianstrimp 13 hours ago

      I work in a field with lots of cheap microchips. I can tell you that the amount of counterfeit copies flooding in from China, as well as the speed at which they are copying, is truly breathtaking.

    • me_me_me 13 hours ago

      What double standard? I have not claimed anything about OpenAI being an ethical paragon of virtue.

      If anything, you can be accused of using whataboutism in order to justify DeepSeek's illegal actions.

      • dns_snek 11 hours ago

        I'm not talking about OpenAI, I'm pointing out the unnecessary sideswipes directed at China whenever they do something bad that the US pioneered.

        It's hard to read those as anything other than casual racism.

        • me_me_me 11 hours ago

          > casual racism

          Yeah, casual racism calling out China on stealing every IP out there, as it's sanctioned by their own government.

          Peak racism.

tntxtnt 19 hours ago

Can they tax DeepSeek just like they taxed BYD cars? Smh, the Chinese ruin US industry again and again and again. Where's Trump at?? Why doesn't he tax the free $0 DeepSeek AI at 1000000%??

RohMin 20 hours ago

this comment section smells like Reddit - ugh

TheJCDenton 20 hours ago

This Deep Whining® technique used by OpenAI is not very effective.