this post was submitted on 09 Mar 2025
316 points (98.8% liked)


Pinterest has updated its privacy policy to reflect its use of platform user data and images to train AI tools.

A new clause, published this week on the company's website, outlines that Pinterest will use its patrons' "information to train, develop and improve our technology such as our machine learning models, regardless of when Pins were posted." In other words, it seems that any piece of content, published at any point in the social media site's long history — it's been around since 2010 — is subject to being fed into an AI model.

In the update, Pinterest claims its goal in training AI is to "improve the products and services of our family of companies and offer new features." Pinterest has promoted tools like a feature that lets users search by body type and its AI-powered ad suite, which according to Pinterest's most recent earnings report has boosted ad spending on the platform. The company is also building a text-to-image "foundational" AI model, dubbed Pinterest Canvas, which it says is designed for "enhancing existing images and products on the platform."

The platform has stressed that there is an opt-out button for the AI training, and says it doesn't train its models on data from users who are minors.

...

Soon after we reached out to Pinterest with questions about the update, we were contacted by a spokesperson who insisted that it wasn't newsworthy because it simply codifies things Pinterest was already doing. Later, the company provided us with an emailed statement.

"Nothing has changed about our use of user data to train Pinterest Canvas, our GenAI model," read the statement. "Users can easily opt out of this use of their data by adjusting their profile settings."

Pinterest was already training its AI tools with user data, as the company touches on in this Medium post about Canvas, but the practice is now codified in the platform's terms of service.

[–] kane@femboys.biz 7 points 17 hours ago (2 children)

My worry is that these social media alternatives might get scraped by these AI companies as well.

Sure, a company handing it over is much easier (e.g. Reddit). But with the decentralized nature, everyone needs to protect their own instance, and I'm not sure how well everyone will be capable of doing that.

Definitely much more difficult, so it’s a step in the right direction.

[–] Womble@lemmy.world 4 points 7 hours ago (2 children)

Everything on the Fediverse is almost certainly scraped, and will be repeatedly. You can't "protect" content that is freely available on a public website.

[–] ayyy@sh.itjust.works 3 points 5 hours ago* (last edited 5 hours ago)

Nuh uh, I wrote an entire license in every one of my comments so it would be impossible for them to scrape! /s

[–] kane@femboys.biz 2 points 6 hours ago

I do not entirely agree.

While what you said might be true for the content we post, things like view history and tracking data are much harder to scrape. That metadata does help with tagging content.

[–] Emperor@feddit.uk 5 points 17 hours ago (2 children)

There are lists of bots that instance Admins can block for a range of reasons.

Anything online can be scraped but big firms might run into regulatory trouble if they are caught randomly scraping sites without consent. At the moment, the big social media apps have a tonne of content to train on in tightly controlled conditions, so they don't really need to go into the wild, yet. However, we need to be vigilant, block them and make a fuss if we catch them at it.
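(For context on what "blocking" looks like in practice: such lists are usually applied at the instance's web server by filtering known crawler user agents. Below is a minimal sketch, assuming an nginx reverse proxy in front of Lemmy; the agent names and the instance hostname are illustrative only, and real community-maintained block lists are longer and change over time.)

```nginx
# Sketch only: map a few well-known AI-crawler user agents to a flag.
# This "map" block belongs in the http context of the nginx config.
map $http_user_agent $blocked_ai_bot {
    default        0;
    ~*GPTBot       1;
    ~*CCBot        1;
    ~*Bytespider   1;
}

server {
    listen 80;
    server_name example-instance.tld;  # hypothetical instance name

    # Reject listed crawlers before their requests reach the Lemmy backend.
    if ($blocked_ai_bot) {
        return 403;
    }

    location / {
        proxy_pass http://127.0.0.1:8536;  # assuming Lemmy's usual default backend port
    }
}
```

(A robots.txt listing the same agents is a gentler alternative, but compliance with robots.txt is voluntary, so server-side filtering is what actually reduces the load.)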

[–] CosmicTurtle0@lemmy.dbzer0.com 3 points 12 hours ago (1 children)

What's to stop a company from standing up their own instance?

If they just create an admin account and then federate with every other instance, they now have everyone's content.

I'm suddenly realizing the anti-AI blurbs people add to their comments now make sense.

[–] JohnEdwa@sopuli.xyz 1 points 4 hours ago

IANAL, but federation by necessity copies your posts and information to every instance there is, and for that to be possible it all needs to be under a licence that allows it, so those blurbs are almost certainly legally meaningless. The only thing I can think of is claiming a non-commercial use violation, but that could put every instance that runs on donations under fire as well.

[–] kane@femboys.biz 2 points 17 hours ago (1 children)

That's a very good shout; I wasn't aware there were pre-existing lists. That's a great step, and definitely one I will look to add to my own instance.

[–] Emperor@feddit.uk 3 points 16 hours ago

We just added it as the old frontend was getting hammered by bots - it helped a lot.