Less Wrong EBook Creator

by ScottL, 13th Aug 2015, 6 comments


I read a lot on my Kindle, and I noticed that some of the sequences aren't available in book form. The ones that are available mostly contain only the posts. I want them to also include summaries and some of the high-ranking comments, so I wrote this tool to automatically create books from a set of posts. It builds the book from the information you give it in an Excel file. The Excel file contains:

Post information

  • Book name
  • Sequence name
  • Title
  • Link
  • Summary description

Sequence information

  • Name
  • Summary

Book information

  • Name
  • Summary

The only compulsory component is the link to the post.

I have used the tool to create books for Living Luminously, No-Nonsense Metaethics, Rationality: From AI to Zombies, Benito's Guide and more. You can see them in the examples folder in this github link. The tool only creates epub books; you can use calibre or a similar tool to convert them to another format.
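As a rough sketch of that conversion step: calibre ships a command-line program called ebook-convert, which takes an input file and an output file and picks the formats from the file extensions. The Python wrapper below is only an illustration and is not part of the tool. The helper names and the file name luminosity.epub are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(epub_path, target_ext="mobi"):
    """Return the ebook-convert command that converts an epub to target_ext."""
    src = Path(epub_path)
    # ebook-convert infers the output format from the output file's extension.
    dst = src.with_suffix("." + target_ext)
    return ["ebook-convert", str(src), str(dst)]


def convert(epub_path, target_ext="mobi"):
    """Run the conversion; requires calibre to be installed and on PATH."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("calibre's ebook-convert was not found on PATH")
    subprocess.run(build_convert_command(epub_path, target_ext), check=True)


# Hypothetical file name, shown only to illustrate the command that is built.
print(build_convert_command("luminosity.epub", "azw3"))
```

Calling convert("luminosity.epub") would then write luminosity.mobi next to the epub, assuming calibre is installed.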

Below is an FAQ on how to use it. If you have any other questions, let me know.

FAQ:

How can I quickly get the tool running?

Download the tool from this dropbox link. Copy the jar and the lib folder and then run.

You can use the xlsx files in the examples folder at this github link for input.

You can use the lesswrong.html file in the resources folder at this github link for a cover page. If someone wants to create a better cover page, then I will update it.

How does it work?

The below image is an example of the program running.

Descriptions of each of the configurable options are below:

  • Output File Location - this is where the epub file will be saved
  • Input Data - this is the location of the Excel file that you want to use as input. The file must follow a specific format; see the xlsx files in the examples folder at the github link for examples.
  • Cover page - this is the location of any html file that you want to use as a cover page. Any image files that this html file uses should be in the same folder.
  • Include comments - this is used to determine whether comments should be included or not
  • Include children comments >= threshold - this determines whether only top-level comments that are greater than or equal to the threshold are included, or whether child comments (replies) that are greater than or equal to the threshold are included as well. For example, if the threshold is five, then of the comments below, comments 1 and 3 are included if this is checked. If it is not checked, only comment 1 is included.

Comment 1 7 points

    Comment 2 3 points

        Comment 3 5 points

  • Threshold - only comments which have a point score that is greater than or equal to this threshold will be included
  • Include posts parent - if this is checked, then comments 1, 2 and 3 below would be included (again with a threshold of five). This is because comment 3 is greater than or equal to the threshold, and comments 1 and 2 are its parents. Only parents of comments that are greater than or equal to the threshold will be included.

Comment 1 2 points

    Comment 2 3 points

        Comment 3 5 points

  • Include posts children - if this is checked, then comments 1, 2 and 3 below would be included. This is because comment 1 is greater than or equal to the threshold, and the other comments are children of a comment that is greater than or equal to the threshold.

Comment 1 7 points

    Comment 2 3 points

        Comment 3 5 points
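The comment options above can be sketched in a few lines of Python. This is not the tool's actual code (the tool is a Java jar); it is a minimal model of the rules as described, and the names Comment, ancestors and select_comments are made up for illustration. One assumption to note: with "Include posts parent" checked, a reply that meets the threshold is kept together with its ancestors, as in the example for that option.

```python
from typing import Dict, Iterator, NamedTuple, Optional, Set


class Comment(NamedTuple):
    score: int
    parent: Optional[int]  # id of the parent comment; None for a top-level comment


def ancestors(comments: Dict[int, Comment], cid: int) -> Iterator[int]:
    """Yield the ids of every ancestor of comment cid, nearest first."""
    parent = comments[cid].parent
    while parent is not None:
        yield parent
        parent = comments[parent].parent


def select_comments(comments: Dict[int, Comment], threshold: int,
                    children_over_threshold: bool = False,
                    include_parents: bool = False,
                    include_children: bool = False) -> Set[int]:
    """Return the ids of the comments that would go into the book."""
    meets = {cid for cid, c in comments.items() if c.score >= threshold}
    # Baseline: only top-level comments that meet the threshold.
    selected = {cid for cid in meets if comments[cid].parent is None}
    if children_over_threshold:
        # Replies that meet the threshold are kept as well.
        selected |= meets
    if include_parents:
        # Assumption: keep each comment that meets the threshold plus its ancestors.
        for cid in meets:
            selected.add(cid)
            selected.update(ancestors(comments, cid))
    if include_children:
        # Keep every descendant of a comment that is already selected.
        base = set(selected)
        for cid in comments:
            if any(a in base for a in ancestors(comments, cid)):
                selected.add(cid)
    return selected


# The two threads used in the examples above (comment 3 replies to 2, 2 to 1).
thread_a = {1: Comment(7, None), 2: Comment(3, 1), 3: Comment(5, 2)}
thread_b = {1: Comment(2, None), 2: Comment(3, 1), 3: Comment(5, 2)}

print(sorted(select_comments(thread_a, 5)))                                # [1]
print(sorted(select_comments(thread_a, 5, children_over_threshold=True)))  # [1, 3]
print(sorted(select_comments(thread_b, 5, include_parents=True)))          # [1, 2, 3]
print(sorted(select_comments(thread_a, 5, include_children=True)))         # [1, 2, 3]
```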

What sites can this tool pull posts from?

Do you have some example output of the tool?

See the epub files in the examples folder at this github link.

The example books had the include comments option checked and the threshold was set to 5.

Do you have some example input data that I can use to create epubs?

See the xlsx files in the examples folder at this github link.

Do you have an example cover page that I can use?

See the lesswrong.html file in the resources folder at this github link.

Where can I download the tool?

Here is the dropbox link to the jar file. Copy the jar and the lib folder and then run.

Where can I download the code?

At this github link.

How do I create an input file?

Copy one of the xlsx files in the examples folder at this github link. Update it as appropriate, e.g. change the links to the posts.

Each row in the first sheet defines a post that will be included in the book: its title, a summary to display for the post, and which sequence and book it belongs to.

The second and third sheets define the summaries to be shown for the sequences and books.
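To make the layout of the first sheet concrete, here is a small Python sketch of how its rows map onto books and sequences. It does not read a real xlsx file and is not how the tool itself is written. The column names come from the list at the top of this post; the function name group_posts, the example.com links, and the fallbacks for missing values (empty names, link used as title) are assumptions made for the example.

```python
def group_posts(rows):
    """Group post rows into {book name: {sequence name: [posts]}}, keeping order."""
    books = {}
    for row in rows:
        link = row.get("Link")
        if not link:
            # The link is the only compulsory field.
            raise ValueError("every row needs a Link")
        # Assumption: missing book or sequence names fall back to an empty name.
        book = row.get("Book name") or ""
        sequence = row.get("Sequence name") or ""
        books.setdefault(book, {}).setdefault(sequence, []).append({
            # Assumption: a missing title falls back to the link itself.
            "title": row.get("Title") or link,
            "link": link,
            "summary": row.get("Summary description") or "",
        })
    return books


# Hypothetical rows; the links are placeholders, not real posts.
rows = [
    {"Book name": "Book A", "Sequence name": "Seq 1", "Title": "First post",
     "Link": "https://example.com/post-1", "Summary description": "Intro."},
    {"Book name": "Book A", "Sequence name": "Seq 1", "Title": "Second post",
     "Link": "https://example.com/post-2"},
    {"Link": "https://example.com/post-3"},
]

books = group_posts(rows)
print(list(books))                  # ['Book A', '']
print(len(books["Book A"]["Seq 1"]))  # 2
```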

I created or improved some of the input files by adding summaries. Should I share them?

Yes. I haven't written summaries for most of the example Excel files. If someone wants to write summaries, I will add them to the github link.

I found a problem with the tool or it is not working. What should I do?

Post a comment below and I will look into it.

What does “Parent comment not included” mean?

An example of this is when you have Include children comments >= threshold checked and the threshold set to 5. With the comments below, comments 1 and 3 would be included, but comment 2 would not be. When you include children comments, there is normally a link to the parent comment. However, the parent of comment 3 (comment 2) is not included, because its score is less than the threshold. The message therefore indicates that the parent comment was not included. If you do want to include comment 2, then you should recreate the book with the include children or include parent option selected.

Comment 1 7 points

    Comment 2 3 points

        Comment 3 5 points
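In code terms, the message applies to any included comment whose parent exists but was left out. A minimal Python sketch of that check follows; the function name and the dictionary layout are hypothetical and not taken from the tool.

```python
def comments_missing_parent(parents, selected):
    """Return ids of included comments whose parent was not included.

    parents maps each comment id to its parent's id (None for top-level).
    selected is the set of comment ids that made it into the book.
    """
    return sorted(
        cid for cid in selected
        if parents.get(cid) is not None and parents[cid] not in selected
    )


# The thread from the example: comment 3 replies to 2, which replies to 1.
parents = {1: None, 2: 1, 3: 2}
selected = {1, 3}  # comment 2 scored below the threshold of 5

print(comments_missing_parent(parents, selected))  # [3]
```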

Hasn't someone else already done this?

Yes. See here for some examples, but I don't think those included any ability to get comments or summaries.

Why is this a separate GUI and not integrated into Less Wrong?

I was really just writing this for myself. Also, based on what was said here (quoted below), it sounds like it should be separate.

Matthew Fallshaw:

Implementing this in the code doesn't seem to be significantly better than implementing an independent scraper, and it increases the amount of code we have to maintain. I think this is not a desirable feature.