A word is worth a thousand vectors - Word2Vec, and why you should care

Dendrophobe · 3545 days ago

stitchfix.com · #machinelearning · #computers

mk · 3544 days ago · link ·

Fascinating stuff. I'd love to mess around with Hubski text in this way. It would be interesting to relate users based upon the words they write.

kleinbl00, this might be a compelling way to relate and suggest tags, not only based upon tags, but upon the text associated with a tag. Theoretically, if we had enough RAM, something like this could replace tags altogether. Posts would be grouped by word vectors alone.
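A rough sketch of what "grouped by word vectors alone" could look like, assuming each post is reduced to the average of its word vectors and posts are compared by cosine similarity. The vectors below are hand-made toy values rather than a trained Word2Vec model, and `post_vector` and `most_similar_post` are hypothetical names, not anything Hubski runs.

```python
import math

# Toy 2-d word vectors, hand-made for illustration only
# (axis 0 ~ "tech", axis 1 ~ "food"). A real model would learn these.
WORD_VECTORS = {
    "vector": [1.0, 0.0],
    "model": [0.9, 0.1],
    "code": [0.8, 0.0],
    "bread": [0.0, 1.0],
    "soup": [0.1, 0.9],
}


def post_vector(text):
    """Average the vectors of the known words in a post; None if no word is known."""
    known = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not known:
        return None
    return [sum(axis) / len(known) for axis in zip(*known)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def most_similar_post(query, posts):
    """Return the post whose averaged vector is closest to the query's, or None."""
    q = post_vector(query)
    if q is None:
        return None
    scored = []
    for post in posts:
        vec = post_vector(post)
        if vec is not None:
            scored.append((cosine(q, vec), post))
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


posts = ["bread soup recipe", "model code review"]
print(most_similar_post("vector model", posts))
```

Averaging is only one of several ways to turn word vectors into a post vector; it ignores word order and treats every word as equally important.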

kleinbl00 · 3544 days ago · link ·

All I know is I read it, shared it, and subscribed to #machinelearning. My big programming trick of the day was copy-pasting JavaScript into a Google spreadsheet and then I broke that. When I see something that says "it will take a multi-core workstation several days to parse the equivalent of 1000 books or 5 million twitter comments" I go "I'll bet this is useful to somebody else."

OftenBen · 3544 days ago · link ·

"Posts would be grouped by word vectors alone."

Damn.

thundara · 3544 days ago · link ·

Well, they do warn at the end that it requires a lot of words in the case of specialized topics. It might not perform too well on Hubski-specific tags (e.g. linking thehumancondition to writing and philosophy, or vaguequestionsbynowaypablo to askhubski).

How does the current related tags method work? Association of tags being used together?

Dendrophobe · 3544 days ago · link ·

Why bother with tags at all? Just throw the entire posts in, see what comes out.

Actually, that'd probably result in unusably large vectors and more data than Google could deal with.

It'd be cool though...

thundara · 3544 days ago · link ·

Back in the day I suggested fetching and scanning the content from links and trying to mine tags from it. The suggestion still stands (especially for self posts), but I had forgotten about it until your reply.

mk · 3544 days ago · link ·

"How does the current related tags method work? Association of tags being used together?"

Yes. Going back a ways, but not through all posts.
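For illustration only, here is one way "tags being used together, going back a ways" could be counted. This is a guess at the general idea, not Hubski's actual code: `related_tags`, the `window` cutoff, and the sample posts are all made up for the sketch.

```python
from collections import Counter


def related_tags(posts, tag, window=1000, top=3):
    """Rank tags by how often they share a post with `tag`.

    posts:  list of tag lists, newest first (hypothetical data layout).
    window: only the newest `window` posts are scanned, not the full history.
    """
    counts = Counter()
    for tags in posts[:window]:
        unique = set(tags)
        if tag in unique:
            counts.update(unique - {tag})
    # Sort by count (descending), then by name so ties come out in a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top]]


posts = [
    ["machinelearning", "computers"],
    ["machinelearning", "programming", "computers"],
    ["writing", "philosophy"],
    ["machinelearning", "programming"],
]
print(related_tags(posts, "machinelearning", window=2))
```

A method like this only sees tags that people already apply together, which is why it cannot relate a tag to the text written under it.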

user-inactivated · 3544 days ago · link ·

"Posts would be grouped by word vectors alone."

I'm no expert, but couldn't this get out of hand real fast? All I know is that Evernote tries to do something like this, and it's wrong almost 90% of the time.

mk · 3544 days ago · link ·

Oh absolutely. It's way beyond our pay grade. Expect that in a Hubski Update in 2021.

user-inactivated · 3544 days ago · link ·

Sell out, Mmkay. Get that sweet, sweet Venture Capitalist Money.

Or, you know, add a donate button. Please. Please.

mk · 3544 days ago · link ·

It was part of Monday's meeting. Truth be told, we aren't going to be able to quit our day jobs with the level of donations/subscriptions that we can currently muster, and that is part of the reason I have held off. That said, it would be nice not having to dig into our own pockets for servers and stickers.

I cannot imagine a VC scenario that would work out well for the site. If Fred Wilson really wanted to give us $10M, it would be difficult not to consider trying to make it work, but I don't think that cash money is what Hubski ought to be after. In my mind, Hubski is about expanding a dimension of human interaction that could benefit from the effort. That goal can easily extend beyond what exists here and now, but I am not keen on selling myself or you all on a new strategy for Hubski that fits a funding model.

Ideally, I would like to get the site into a state where it sustains itself and can continue its mission with minimal polluting influences. I do think it is possible, and I think that it will be even more possible as time passes. The team seems to think that there won't be much harm in an early revenue model; we are going to experiment with that this year.

user-inactivated · 3544 days ago · link ·

Hell yeah! Haven't even been here that long and I can already tell 2015 is gonna be the hypest year of Hubski.

lil · 3544 days ago · link ·

I love that this "thoughtful" comment about Hubski directions is hidden in this obscure feed.

steve · 3543 days ago · link ·

NPR has evergreen partners (a set amount comes out of my checking account every month). Hubski could do the same. Think givemarkadollar.com, but ongoing. No - you still couldn't quit your day job, but that could provide a steady, reliable stream of server/sticker opex. People feel good about contributing to causes they love.

veen · 3544 days ago · link ·

I wonder, would a small ad box above the feed make a significant difference? Personally I wouldn't mind that at all.

mk · 3544 days ago · link ·

I only want to entertain it if the benefit is significant. At this time, I don't think it would be.

user-inactivated · 3544 days ago · link ·

insom and I talked about this once and she seemed to think that it would hardly make a difference financially. Not enough pageviews etc.

Dendrophobe · 3544 days ago · link ·

It's a bit technical, but it may be of minor interest to lil given her recent post on algorithmic text.

lil · 3544 days ago · link ·

This article gives me an idea of what algorithmic writing is all about. Thanks -- does your name mean tree-fearing?

Dendrophobe · 3544 days ago · link ·

Yes - a bit of an inside joke from my last job. I rather like biological trees :)

galen · 3544 days ago · link ·

Is that a trees/trees joke?

Dendrophobe · 3544 days ago · link ·

I'm not sure.

If you're asking if it's a marijuana reference (only other common meaning for trees that I'm aware of), then no. It's due to my (alleged) aversion to trees. EDIT: It occurs to me that you're probably asking about my 'liking biological trees' comment. I'm referring to the large plants made out of wood, there.

Basically, I wrote some really slow, difficult to maintain code that operated on very, very large trees. I still work with many of the same people at a new company, and I still haven't heard the end of it.

galen · 3544 days ago · link ·

So it is trees/trees, in that you like biological trees, but (allegedly) not data-structure trees? Awesome.

veen · 3544 days ago · link ·

I found it quite interesting. If I recall correctly, Google uses the same method to build its Knowledge Graph. They use a similar vector technique to figure out what the subject of a photo is, for example.

Dendrophobe · 3544 days ago · link ·

It's just completely mind-blowing to me that it works at all. It's like

"Hey, here's a bunch of words and a rule for turning them into lists of numbers! If you do math on the numbers and convert them back to words, the results make sense!"

How bizarre is that? The whole "king - man + woman = queen" example is just insane.
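To make the arithmetic concrete, here is a tiny sketch with hand-made two-dimensional vectors. These are not trained Word2Vec embeddings: the numbers are chosen by hand so that one axis roughly means "royalty" and the other "gender", and the `analogy` helper is a made-up name. Real models learn hundreds of dimensions from text and only land near the right answer, not exactly on it.

```python
import math

# Hand-made toy vectors, NOT trained embeddings.
# axis 0 ~ "royalty", axis 1 ~ "gender" (+1 male, -1 female).
VECTORS = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.0],
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def analogy(positive, negative, vectors=VECTORS):
    """Add the `positive` word vectors, subtract the `negative` ones,
    and return the closest remaining word by cosine similarity."""
    dims = len(next(iter(vectors.values())))
    target = [0.0] * dims
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    # The input words are excluded, as is conventional for analogy queries.
    excluded = set(positive) | set(negative)
    candidates = [w for w in vectors if w not in excluded]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# king - man + woman
print(analogy(positive=["king", "woman"], negative=["man"]))  # prints "queen"
```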
