Fascinating stuff. I'd love to mess around with Hubski text in this way. It would be interesting to relate users based upon the words they write. kleinbl00, this might be a compelling way to relate and suggest tags, not only based upon tags, but upon the text associated with a tag. Theoretically, if we had enough RAM, something like this could replace tags altogether. Posts would be grouped by word vectors alone.
All I know is I read it, shared it, and subscribed to #machinelearning. My big programming trick of the day was copy-pasting JavaScript into a Google spreadsheet, and then I broke that. When I see something that says "it will take a multi-core workstation several days to parse the equivalent of 1000 books or 5 million twitter comments" I go "I'll bet this is useful to somebody else."
Well, they do warn at the end that it requires a lot of words in the case of specialized topics. It might not perform too well on Hubski-specific tags (e.g. linking thehumancondition to writing and philosophy, or vaguequestionsbynowaypablo to askhubski). How does the current related-tags method work? By associating tags that are used together?
Why bother with tags at all? Just throw entire posts in and see what comes out. Actually, that'd probably result in unusably large vectors and more data than Google could deal with. It'd be cool though...
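For anyone who wants to poke at the "group posts by their words" idea, here is a rough sketch. To be clear about what it is: it uses plain word counts and cosine similarity, not the learned word vectors from the article, and the function names and sample posts are made up for illustration. It has nothing to do with how Hubski actually relates tags.

```python
import math
from collections import Counter


def post_vector(text):
    """Turn a post into a sparse bag-of-words vector (word -> count).

    Assumption: simple whitespace tokenizing with punctuation stripped.
    These are raw counts, not trained word embeddings.
    """
    words = [w.strip(".,!?\"'()").lower() for w in text.split()]
    return Counter(w for w in words if w)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(posts, query_id):
    """Rank every other post by similarity to the post with id query_id."""
    vectors = {pid: post_vector(text) for pid, text in posts.items()}
    query = vectors[query_id]
    scores = [
        (pid, cosine_similarity(query, vec))
        for pid, vec in vectors.items()
        if pid != query_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical posts, invented for this example.
posts = {
    "a": "word vectors group related words together",
    "b": "related words end up near each other as vectors",
    "c": "monthly donations could cover server costs",
}

ranking = most_similar(posts, "a")
```

Each vector here has one entry per distinct word in the post, which hints at the size problem: across a whole site the vocabulary, and so the vector length, gets large quickly.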
It was part of Monday's meeting. Truth be told, we aren't going to be able to quit our day jobs with the level of donations/subscriptions that we can currently muster, and that is part of the reason I have held off. That said, it would be nice not having to dig into our own pockets for servers and stickers. I cannot imagine a VC scenario that would work out well for the site. If Fred Wilson really wanted to give us $10M, it would be difficult not to consider trying to make it work, but I don't think that cash money is what Hubski ought to be after. In my mind, Hubski is about expanding a dimension of human interaction that could benefit from the effort. That goal can easily extend beyond what exists here and now, but I am not keen on selling myself or you all on a new strategy for Hubski that fits a funding model. Ideally, I would like to get the site into a state where it sustains itself and can continue its mission with minimal polluting influences. I do think it is possible, and I think that it will be even more possible as time passes. The team seems to think that there won't be much harm in an early revenue model; we are going to experiment with that this year.
NPR has evergreen partners (a set amount comes out of my checking account every month). Hubski could do the same. Think givemarkadollar.com, but ongoing. No, you still couldn't quit your day job, but that could provide a steady, reliable stream of server/sticker opex. People feel good about contributing to causes they love.