vpetukhov

https://github.com/VPetukhov/

Effective Altruism Denmark

Comments

Interactive exploration of LessWrong and other large collections of documents

Great job! I'm going to add this functionality, probably in the next couple of weeks. But it will work only for individual articles. If you want to do this in bulk, I can run the script on your spreadsheet and send you the results. Would you be interested?

As for posting it on the EA Forum, I first want to run some analysis that shows insights relevant to EA. That takes time, so I'm postponing the post for now.

Interactive exploration of LessWrong and other large collections of documents

Sorry for the confusion. The things I listed are not programming libraries but text editor extensions. VSCode is a very fancy text editor, and the extensions I listed are part of the Foam ecosystem, which helps organize notes and navigate across them. So if you're interested, I'd suggest checking those out, as well as the recommended extensions list. They may be helpful for your problem on their own.

Interactive exploration of LessWrong and other large collections of documents

Do you plan on developing a version of this which could be downloaded and used locally on one's own computer, or does it require too much computing power for that to be viable?

Partially, yes. Turning the whole tool into a usable portable app is a lot of trouble, especially since the visualization is not very portable right now. So I'm mainly thinking about publishing a piece of code that could be integrated into some existing note systems. It would be relatively easy to extract the part that goes through all documents and runs the clustering and embedding. Then, let's say, it could add a YAML header with tags to each note, in a format compatible with the Nested Tags VSCode extension. In theory, we could also adjust the graph visualization extension to show an overview of the notes, but that would be trickier. Would that be what you need?

I should still say that I'd expect the methods to need tuning before they apply well to notes, so it may not work out of the box. Computational power shouldn't be a problem unless you have tens of thousands of notes.
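
To make the "YAML header with tags" step more concrete, here is a minimal sketch of what it could look like. It is not code from the tool: the function names, the `tags_by_file` mapping (standing in for whatever the clustering step would output), and the exact header layout are all assumptions for illustration, and the format the Nested Tags extension actually expects should be checked against its documentation.

```python
from pathlib import Path


def add_tags_front_matter(note_text: str, tags: list[str]) -> str:
    """Prepend a YAML header listing `tags` to a note (hypothetical helper).

    Notes that already start with a `---` header are returned unchanged;
    merging into an existing header is left out of this sketch.
    """
    if note_text.startswith("---\n"):
        return note_text
    # Tags are written as plain YAML scalars; tags containing characters
    # such as ':' or '#' would need quoting, which is not handled here.
    tag_lines = "".join(f"  - {tag}\n" for tag in tags)
    return f"---\ntags:\n{tag_lines}---\n\n{note_text}"


def tag_notes(notes_dir: str, tags_by_file: dict[str, list[str]]) -> int:
    """Add headers to every *.md note in `notes_dir` that has tags assigned.

    `tags_by_file` maps a file name to its tags and is assumed to come
    from the clustering step. Returns the number of files rewritten.
    """
    changed = 0
    for path in sorted(Path(notes_dir).glob("*.md")):
        tags = tags_by_file.get(path.name)
        if not tags:
            continue
        old_text = path.read_text(encoding="utf-8")
        new_text = add_tags_front_matter(old_text, tags)
        if new_text != old_text:
            path.write_text(new_text, encoding="utf-8")
            changed += 1
    return changed
```

Nested tags are written here with a slash (for example `ai/alignment`), which is an assumption about how the hierarchy would be encoded. Calling `tag_notes("notes", {"idea.md": ["ai/alignment", "forecasting"]})` would rewrite only `notes/idea.md` and leave untagged notes untouched.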

Could your tool be used as part of such a thing?

Thanks for the detailed explanation! It's quite close to the ideal outcome as I see it. However, the unbiased text summary part is close to impossible at the current level of technology (to my knowledge). Maybe in several years. Also, the map of events requires a good way of extracting facts from text. I really want to play with that at some point, but it could take a while. Still, we will be moving in this direction.

Interactive exploration of LessWrong and other large collections of documents

Thanks! I can upload the table with cluster and tag labels per post if needed. From this mapping, answers should be relatively straightforward.