Thanks for sharing these details. I'd like to chat more about it if you'll bear with me:
Last I checked I was "tense". Is it theoretically possible to explain why your models came to that conclusion? Is there a theoretical data view that could say "here are the comments that contributed to you being considered 'tense'" (or even better: specific parts of the comments)?
From my perspective, without a way to concretely back up any conclusion made, I simply cannot consider the complex-sounding systems to be any better than RNG.
I alluded to this in my first comment, but I think it's imperative that if you're going to explicitly call out what mood you think someone was/is/will be, you need to back it up with "why" so that, at the very least, someone can say, "that's clearly an unfair representation of my mood and here's where the algorithms messed up."
I'm going to go out on a bit of a limb and assume that you're just having some fun and exploring an interesting feature. And I think that's really quite cool. But I think we're getting into a software engineering ethics realm that I'm ill-equipped to dig into. You've got a site that applies "moods" to individuals in a way that doesn't appear to be verifiable. And even if they are verifiable, and are accurate, is that even the right thing to do?
I don't want to ruin your fun or twist your arm into changing this. I'm just responding to my red-alert alarm that lives in my stomach. It's set off by the discovery that one of my few online accounts where I don't hide my identity has been assigned a mood label.
First, this conversation is not going to be comforting and your red-alert alarm should be going off. I know this is building toward the dystopian future. I personally use Signal, put my cell phone in the microwave when I come home, and am writing this through a VPN.
That being said, I'm 100% confident that Google, Apple, Amazon, and more know WAY more about you than I do. This system only looks at public data and only links back to usernames. Strangely enough, it can identify users across sites (even without knowing your identity) - say Reddit and Hacker News. It was originally built for https://projectpiglet.com/ to find company insiders, and make trades (which works ridiculously well btw).
There should be no expectation of privacy.
> You've got a site that applies "moods" to individuals in a way that doesn't appear to be verifiable. And even if they are, and are accurate, is that even the right thing to do?
The short answer is, I don't know. I don't want this to exist, but it does. I also know that I have already built systems for companies doing this and more. Albeit the system here is much more of a "turn-key" solution and it's patent-pending for what it's worth (I'm also against patents and donate to the EFF, but feel I can potentially segment off this market). Still, lots of people are looking to build systems such as this. Personally, I'd rather be the one guiding the ship, because I do ask myself the same questions you're asking.
I'll be honest... I've used this system to show my friends that even when they change usernames I can still identify them. That particular system (which looks at speech patterns) is turned off here. That's in part because of those questions. However, I'm 100% confident, others are doing that right now, with this conversation. We know that because Snowden shared it with us.
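For readers wondering what "looks at speech patterns" could mean in practice, here is a minimal sketch of one common stylometry approach: comparing character n-gram frequency profiles with cosine similarity. This is not the method used by the system discussed here, which isn't described in this thread. The function names (`char_ngrams`, `cosine_similarity`, `closest_author`) and the sample texts are all made up for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams after normalising case and whitespace."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram count profiles (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_author(unknown: str, known: dict[str, str]) -> tuple[str, float]:
    """Return the known account whose writing profile is most similar to `unknown`."""
    profile = char_ngrams(unknown)
    scores = {
        name: cosine_similarity(profile, char_ngrams(text))
        for name, text in known.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Hypothetical accounts and text, purely for illustration.
    known_accounts = {
        "account_a": "the quick brown fox",
        "account_b": "zzzz yyyy xxxx",
    }
    print(closest_author("the quick brown fox jumps", known_accounts))
```

With only a sentence or two per account this is a toy. A real attempt would need far more text per user and much stronger features, but it shows why changing a username alone doesn't hide much.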
Till recently, my primary account was a certain prominent account on HN. I was wondering if you'd be willing to email me the name of it.
I would be shocked and impressed, and it would serve as a great demo.
(I totally believe you. I'm just fascinated.)
Basically, I'm volunteering to be a lab rat.
Sent
Wow. Cheers for one of the coolest things I've ever seen on HN.
(The algorithm identified not one, but both of my former alts.)
Edited to tone down the reaction, but that's amazing.
lol And now, I'll promptly shut down that system again...
As I said in the previous comment, that IMO is going too far. Even though I'm 100% sure others are using similar systems, I don't feel comfortable in that business -_-
Kinda highlights that, although the hnprofile.com demo may have its current faults, there's a lot you can do. The cool part is that the platform (called Metacortex) is easy to build apps on top of.
Fascinating! Have you considered using the tool to try to identify who wrote the recent NY Times anonymous piece? (Assuming you can find enough source material from White House staff)
Ok I've got to ask - can you PM me any of my old or alternative usernames across the web?
I don’t know what for yet, but I’m absolutely certain I will need to pay for your product. Quite the demo!
If you’re ever in the Chicagoland area, happy to buy you a beer!
If you ever need to work on insider threat detection, let me know
Hey, I’m in Chi too! Right by North and Clybourn’s redline stop. If you ever want to reverse engineer this together or figure out how to train a drone to fly itself, drop an email / keybase.
> Strangely enough, it can identify users across sites (even without knowing your identity) - say Reddit and Hacker News.
I think a larger demonstration could make for a very meaningful "Show HN" with hopefully a large impact on privacy consciousness.
I'm sure some of us expected this, but to have it demonstrated by a single person in a casual "Oh, it could do this as well, but I don't have it turned on for ethical reasons" manner is quite effective.
Maybe it will raise awareness that "big data" isn't just trying to correlate cat-pictures one posts on their communication medium to cat food advertising, but everything you write anywhere on the Internet, even on different media under different (or no) accounts, into permanent and guarded profiles with no recourse of opting-out of the machine, effectively rendering informal discourse over the Internet dead.
This is very odd. How did it happen that you're the ship-steerer for this highly cutting-edge tech? Where did the research come from? Usually, this kind of stuff doesn't just pop out of the ether, and I've never seen anything quite so promising in this domain before. (Maybe I just haven't been paying attention.)
> Personally, I'd rather be the one guiding the ship, because I do ask myself the same questions you're asking.
I'm not necessarily the one guiding the ship, but I'd like to be (by being the largest on the market), and I'm just working to launch this as a business (called Metacortex). Fact is, I've spent years building this out for my trading platform. Most of the research is my own, with some contributions from open source and academia as it relates to NLP. That's why I submitted a patent on it: I want to corner this particular (and seemingly highly effective) method.
That being said, I've worked on products with similar goals, and have done technical reviews of WAY creepier products. Luckily, those other products rarely work at all. IMO the technology isn't quite there yet (outside of some of what I demoed), however, it's uncomfortably close.
Snowden told us this was happening, years ago. I'm sure they've improved since.
> That's why I submitted a patent on it, I want to corner this particular (and seemingly highly effective) method.
Which (turns out one of my areas of interest is "which") really only gives you a monopoly on selling the method but not on internal usage by less scrupulous actors... at least that's my opinion. It would be near impossible to enforce a patent on non-public-facing code used to track people in BigCorp.
I do like the idea though, it's not like anyone (reasonably) should believe they have any semblance of anonymity on the internets and this just goes to show how easy it is to follow you around on the webs. If an individual can pull this off then the sky's the limit for the TLAs with massive budgets and computing horsepower.
...so...you feel tense? I feel tense reading this. But yes, I wouldn't like to read that either.
Ok, now I've waded in, I should do something to help. Reads the first few pages of your comments.. hmm Nah, you don't seem tense. All kinds of emotions. It's not you, it's HNProfile. Tries to think why it might say that.. There were a few with emotional words, frustration with situations..probably slightly more negative-emotion words than positive.. hmm I think maybe you just write better than most people, more vividly, mostly about serious topics. And are engaged with them. I don't know if it helps, but I say That's clearly an unfair representation of your mood. (I don't know you, wasn't paid by you)
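The guess above (slightly more negative-emotion words than positive) is the kind of thing the "data view" asked for at the top of the thread would let someone check. A minimal sketch, assuming a simple word-list scorer, which is almost certainly not how HNProfile actually works: the `NEGATIVE` and `POSITIVE` lexicons, the function names, and the sample comments are all hypothetical.

```python
import re

# Hypothetical, tiny lexicons for illustration only.
NEGATIVE = {"frustrated", "annoying", "broken", "worried", "alarm"}
POSITIVE = {"cool", "great", "amazing", "fun", "happy"}


def explain_comment(comment: str) -> dict:
    """Return the emotion words found in one comment and a net score."""
    words = re.findall(r"[a-z']+", comment.lower())
    negative_hits = [w for w in words if w in NEGATIVE]
    positive_hits = [w for w in words if w in POSITIVE]
    return {
        "comment": comment,
        "negative": negative_hits,
        "positive": positive_hits,
        "score": len(positive_hits) - len(negative_hits),
    }


def explain_label(comments: list[str]) -> list[dict]:
    """List comments from most negative to most positive net score."""
    return sorted((explain_comment(c) for c in comments), key=lambda r: r["score"])


if __name__ == "__main__":
    sample = [
        "This is cool and fun",
        "I am worried, the build is broken",
        "Nothing to see here",
    ]
    for row in explain_label(sample):
        print(row["score"], row["negative"], row["positive"], "|", row["comment"])
```

Even a view this crude would let a user point at a specific comment and say "that's where the label came from, and it's wrong."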