/var/log/rant: Thoughts on ML Techniques to better handle Twitter

2016/09/04

Thoughts on ML Techniques to better handle Twitter

Thinking things through afk and thus not as polished as my normal posts.

Been thinking about grouping my friends (those I follow) strictly by relationship mapping. In part, I haven't done this because I can't read the math in the paper describing it, and in part because there are points where I serve as connecting node between two clusters and they have started interacting independently. I know a joy of Twitter is that it allows people to connect by interest and personality, not geography, but when a programmer in Washington and an activist in Indiana talk food and cats with each other, it makes my "programmer" cluster and my "Indiana" cluster less distinct.

So, what to do?

Topic Modelling.

I know about this via the Talking Machines podcast, and, without mathematic notation, if you take a body of text as a collection of words, the words it contains will vary by subject. If the topic is "politics", the text might contain "vote" and "veto" and "election" and "impeach". If the topic is "football", we'd see "lateral", "quarterback", "tackle" and "touchdown".

Rather than separating Twitter followers into groups simply by interactions, I could start with both certain lists I have curated (and yes, there are both "local tweeters" and "programmer" lists) and hashtags (because if you hashtag your tweet #perl, you likely are talking about Perl) to start to identify what words are more likely to come up when discussing certain subjects, then start adding then to those lists automatically.

If I can work this out 140 characters at a time.

/var/log/rant

Cookie Notice

2016/09/04

Thoughts on ML Techniques to better handle Twitter

No comments:

Post a Comment