Nvidia released a paper about a 100KB text-to-image model that only trained for 4 minutes but claims to be better than bigger models (research.nvidia.com)

submitted 2 years ago by hayek@feddit.de to c/technology@lemmy.world

They also claim that it only takes about 8 seconds to generate various good images.

[–] etrotta@kbin.social 22 points 2 years ago (1 children)

Might want to clarify: The "model" in this case is not a full model like Stable Diffusion, but rather something applied like a patch on top of one, more comparable to LoRA.

I don't think that anyone would misunderstand anyway, but better safe than sorry
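
To make the "patch" comparison concrete, here is a minimal sketch of a LoRA-style low-rank update in plain Python. It is not the method from the Nvidia paper; the layer sizes, the rank, and the helper names (`lora_param_count`, `patched_forward`) are assumptions chosen for illustration only. It shows why such a patch can be tiny next to the base weights it modifies.

```python
def lora_param_count(d_out, d_in, rank):
    """Parameters in a low-rank update B (d_out x rank) times A (rank x d_in)."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def patched_forward(W, A, B, x, scale=1.0):
    """Compute (W + scale * B @ A) @ x without building the patched matrix.

    W is the frozen base weight; only A and B would be stored in the patch.
    """
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + scale * l for b, l in zip(base, low_rank)]


# Hypothetical layer size and rank, for illustration only.
d_out, d_in, rank = 768, 768, 4
full_params = d_out * d_in
patch_params = lora_param_count(d_out, d_in, rank)
print(f"base layer: {full_params} weights, patch: {patch_params} weights")
```

With these assumed sizes the patch holds roughly 1% of the layer's weights, which is the general reason a file of this kind can be so much smaller than the model it plugs into.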

[–] astrsk@kbin.social 5 points 2 years ago

That’s the real meat of this. The future of models will be these smaller, focused “patches” that have some kind of traceable lineage. At least when it comes to marketing and selling these.

[–] hoshikarakitaridia@sh.itjust.works 17 points 2 years ago (1 children)

I'm always sceptical about those claims.

Let them prove it, and then we can decide if it's good or not, instead of getting our hopes up for empty promises.

It's not the first time people have made outlandish claims about AI, even though of course you'd expect someone like Nvidia to be cognisant of this kind of marketing.

[–] zalack@kbin.social 11 points 2 years ago* (last edited 2 years ago)

NVIDIA's marketing overhypes, but their technical papers tend to be very solid. Obviously it always pays to remain skeptical, but they have a good track record in this case.

[–] JackGreenEarth@lemm.ee 9 points 2 years ago

Release it, and let us see. Don't just claim stuff.

[–] ghariksforge@lemmy.world 7 points 2 years ago

Where can we download the model?

[–] ubermeisters@lemmy.world 7 points 2 years ago (1 children)

Pretty neat. The training process takes a while for textual inversion, which I have enjoyed playing around with. I hope Automatic1111 gets support for this method of training, if it takes off!

[–] AngrilyEatingMuffins@kbin.social 2 points 2 years ago (1 children)

Can this be adapted to LLMs?

[–] ubermeisters@lemmy.world 3 points 2 years ago

Great question, I wondered the same thing. I've got a decent knowledge base where Stable Diffusion (text-to-image etc.) is concerned, and I understand the applications of this Nvidia process, but I'm not familiar enough with customization options for LLMs. I haven't really seen references to hypernetwork/LoRA/Midjourney-type applications in LLMs, or anything that really "plugs into" your existing model to augment results, the way Stable Diffusion is geared for customization. It seems, in my limited understanding, that customization for LLMs requires customizing the training data and running a completely new training process for the actual model, not a reference model like SD.