Historical Ignorance

In commenting on recent violent protests, Daniel Greenfield says:

But nobody is that historically ignorant that they don’t know what a red flag with a hammer and sickle or a swastika stands for.

Theodore Dalrymple, author of Life at the Bottom: The Worldview That Makes the Underclass, disagrees! Some quotes:

A considerable number of the auto-tattooed inject themselves with swastikas. At first I thought this was profoundly nasty, a reflection of their political beliefs, but in my alarm I had not taken into consideration the fathomless historical ignorance of those who do such things to themselves. People who believe (as one of my recent patients did) that the Second World War started in 1918 and ended in 1960 – a better approximation to the true dates than some I have heard – are unlikely to know what exactly the Nazis and their emblem stood for, beyond the everyday brutality with which they are familiar, and which they admire and aspire to.

I cannot recall meeting a 16-year-old white from the council estates that are near my hospital who could multiply nine by seven (I do not exaggerate). Even three by seven often defeats them. One boy of 17 told me, ‘We didn’t get that far.’ This after 12 years of compulsory education (or should I say, attendance at school). As to knowledge in other spheres, it is fully up to the standards set in mathematics. Most of the young whites whom I meet literally cannot name a single writer and certainly cannot recite a line of poetry. Not a single one of my young patients has known the dates of the Second World War, let alone of the First; some have never heard of these wars, though recently one young patient who had heard of the Second World War thought it took place in the 18th century. In the prevailing circumstances of total ignorance, I was impressed that he had heard of the 18th century. The name Stalin means nothing to these young people and does not even evoke the faint ringing of a bell, as the name Shakespeare (sometimes) does. To them, 1066 is more likely to mean a price than a date.

My patient was intelligent but badly-educated, as only products of the British educational system can be after eleven years of compulsory school attendance. She thought the Second World War took place in the 1970s and could give me not a single correct historical date.

A few days earlier I had met a publisher for lunch, and the subject of the general level of culture and education in England came up. The publisher is a cultivated man, widely read and deeply attached to literature, but I had difficulty in convincing him that there were grounds for concern. That illiteracy and innumeracy were widespread did not worry him in the least, because – he claimed – they had always been just as widespread. (The fact that we now spent four times as much per head on education as we did 50 years ago and were therefore entitled to expect rising rates of literacy and numeracy at the very least did not in the slightest knock him off his perch.) He simply did not believe me when I told him that nine of ten young people between the ages of 16 and 20 whom I met in my practice could not read with facility and were incapable of multiplying six by nine, or that out of several hundreds of them I had asked when the Second World War took place, only three knew the answer.

Justin Makes a Ruby Program Part 2

Elliot Temple replied to my last post with several good comments and I wanted to address them.

Elliot Temple’s comments have a yellow background:

I would write:

You’re mutating state for no reason and making it longer.

Great suggestion, thanks!

For sanitizing, shouldn’t you keep div and span tags? I’ve used those for formatting sometimes (offhand I know some blockquotes use divs). Also shouldn’t you allow hr tags? I use those.

All good suggestions, thanks. I don’t really have a good understanding of HTML and CSS yet so I’m not good at judging things like what’s important to include in a sanitize list. I was mostly judging by noting that the output files looked good enough, but the hr tag in particular helps with a problem I’d noticed on some of the pages so thanks for pointing that out.

The <hr> tag was not agreeable to the Upmark gem, so I added the following to handle that issue:
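Here’s a minimal sketch of one way to work around a converter that chokes on <hr>, not necessarily the fix used in the script. It swaps each <hr> for a plain-text placeholder before conversion and turns the placeholder into a Markdown rule afterward. The placeholder string and both method names are made up for illustration.

```ruby
# Hypothetical placeholder; any string unlikely to appear in a post would do.
HR_PLACEHOLDER = "HRPLACEHOLDER"

# Replace <hr>, <hr/>, and <hr /> (any case) with the placeholder on its own
# paragraph. Tags carrying attributes are not handled by this simple pattern.
def protect_hr_tags(html)
  html.gsub(/<hr\s*\/?>/i, "\n\n#{HR_PLACEHOLDER}\n\n")
end

# After the HTML-to-Markdown conversion, turn the placeholder into a rule.
def restore_hr_tags(markdown)
  markdown.gsub(HR_PLACEHOLDER, "---")
end

protected_html = protect_hr_tags("<p>one</p><hr /><p>two</p>")
puts restore_hr_tags(protected_html)
```

The blank lines around the placeholder matter: a line of dashes directly under text would be read as a Markdown heading underline rather than a rule.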

Also shouldn’t you convert h1 and h2 tags similar to the h3?

Yeah it doesn’t hurt to do them all!

Elliot also suggested I handle the RegEx for header tags by just doing substitution instead of using a block. This is what I came up with for the header issue with Elliot’s help (my syntax highlighter isn’t handling #{} well, so ignore the highlighting on this code snippet):
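A minimal sketch of the substitution approach, not necessarily the exact code from the script. It assumes the header tags carry no attributes and that each header sits on a single line; the method name is invented for the example.

```ruby
# Convert <h1>..<h3> tags to Markdown headings using plain substitution.
# The tag level is interpolated into the pattern, and the header text is
# referenced in the replacement string with the \1 backreference.
def convert_headers(html)
  (1..3).reduce(html) do |text, level|
    hashes = "#" * level
    text.gsub(/<h#{level}>(.*?)<\/h#{level}>/i, "#{hashes} \\1")
  end
end

puts convert_headers("<h1>Title</h1>\n<h3>Section</h3>")
```

Using reduce here threads the string through each level without reassigning a variable, which fits the earlier point about not mutating state for no reason.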

And I don’t understand why the sanitizer doesn’t have b, i, u, em, strong on the list.

Fixed thanks.

Thanks for your help Elliot!

With regard to code in blog comments, Markdown has been activated in the comments, and it plays nicely with my syntax highlighting plugin! (I’m using Crayon if anyone is curious)

Inline code syntax is `back-ticks like this`

which produces code styled like this

and for blocks of code, it’s a fence of three backticks around your code like this


Justin Makes a Ruby Program!

I’m helping a friend convert some HTML-formatted posts from his blog into an ePub. I wanted to use the great Markdown editor Ulysses to produce the ePub, because it’s a nice program and because being able to edit in Ulysses would give me some easy control over the formatting of the ePub.

This required two steps.

The first was to get the HTML files into a format suitable for importing into Ulysses and outputting into an ePub.

This was a bit tricky, because I wouldn’t want to just do a straightforward conversion from HTML to Markdown. I needed to retain some HTML so that it could be styled by the CSS sheet I’d use in Ulysses and look correct. So I needed to be able to selectively remove formatting while getting it Markdown-ish enough for Ulysses to be able to output to ePub.

The second step would be to actually output the ePub with the desired styling. This would involve customizing a CSS style sheet in Ulysses.

Step two was basically trivial: I modified Jennifer Mack’s excellent KBasic style sheet, essentially copying and pasting (with modification) some of the relevant CSS from my friend’s blog. No prob.

So this post will focus on the first step, the Ruby program and associated gems I used to get the HTML files in shape for Ulysses, since that was the interesting part.

Step by step

This bit lets us work with various Ruby Gems necessary for the project:

I wanted the script to iterate through a whole directory of HTML files. That’s what this next bit does:
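As a rough sketch of that kind of loop (the helper name and file names are made up, and the real script may do this differently), Dir.glob can collect every .html file in a folder:

```ruby
require "tmpdir"

# Return the HTML files in a directory, sorted so the run order is predictable.
def html_files_in(dir)
  Dir.glob(File.join(dir, "*.html")).sort
end

# Demo on a throwaway directory so the sketch runs anywhere.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "post-1.html"), "<h1>One</h1>")
  File.write(File.join(dir, "notes.txt"), "not a post")
  html_files_in(dir).each do |path|
    puts "Would process #{File.basename(path)}"
  end
end
```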

The next few steps use Nokogiri, an HTML/XML parser, to pull the information we want from the HTML file we’re currently working with.

This initializes a variable that will let us work with the content of the html file in Nokogiri:

This initializes a variable for the content of the blog post we are gonna turn into an ePub:

In contrast with what I do below on the blog post’s title, I am not using the .text Nokogiri method to extract the text of the article. Why? Cuz I need the HTML formatting of the article for various purposes (to convert to Markdown and to have properly formatted blockquotes). Nokogiri has a css selector that lets you extract elements from web pages. See how it works here.

We want to put the body of the post into a string for further manipulation, so we do that next:

This initializes a variable for the title of the article. We just want the text here, so we’ll use the .text Nokogiri method:

This concatenates the title with a leading “# ” so that it will be a proper Markdown title, which will automate the process of turning blog posts into individual chapters in an ePub:

(The strip method prevents the title from appearing on a separate line from the “#”.)
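To see why strip matters, here’s a small sketch (the method name and sample title are invented). Text pulled out of HTML often comes with stray newlines and spaces around it, and a leading newline would push the title onto the line after the “#”.

```ruby
# Build a Markdown level-1 heading from raw title text.
# strip removes leading and trailing whitespace, including newlines.
def markdown_title(raw_title)
  "# " + raw_title.strip
end

puts markdown_title("\n   My Post Title\n")
```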

Next we’re gonna combine the post title and body. First we initialize an empty string:

Then we concatenate the strings we’ve got for the title and body, with the title being added first of course:
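A sketch of those two steps, assuming a blank line between title and body (the separator and the method name are my assumptions, not something taken from the script):

```ruby
# Start from an empty string and append the title, then the body.
def combine_title_and_body(title, body)
  post = +""                        # unary plus gives a mutable empty string
  post << title << "\n\n" << body   # title first, blank line, then the body
  post
end

puts combine_title_and_body("# My Post Title", "<p>First paragraph.</p>")
```

The same result can be had without the empty string at all, for example with string interpolation, which is shorter and avoids mutating anything.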

This next bit puts blockquotes on their own lines, which helps Ulysses handle blockquotes correctly when making ePubs:
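One way this could be done, as a sketch rather than the script’s actual code: use gsub to surround every opening and closing blockquote tag with newlines. The method name is invented, and putting each tag (rather than each whole quote) on its own line is my reading of the step.

```ruby
# Put each <blockquote> and </blockquote> tag on a line of its own.
# Whitespace already next to a tag is swallowed so the newlines don't pile up.
def isolate_blockquotes(html)
  html.gsub(/\s*(<\/?blockquote[^>]*>)\s*/i, "\n\\1\n")
end

puts isolate_blockquotes("<p>a</p><blockquote>q</blockquote><p>b</p>")
```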

This converts <h3> tags to Markdown-appropriate format, which Upmark wasn’t handling well for some reason:
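A sketch of a block-based version of this step, assuming the tags have no attributes and each header sits on one line (the method name is made up):

```ruby
# Turn <h3>Heading</h3> into "### Heading" using gsub with a block.
# Regexp.last_match(1) is the text captured between the tags.
def convert_h3(html)
  html.gsub(/<h3>(.*?)<\/h3>/i) { "### #{Regexp.last_match(1).strip}" }
end

puts convert_h3("<h3> Step by step </h3>")
```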

Sanitize “cleans up” html files by removing stuff that’s not on the whitelist. I had to build my own custom whitelist to get curi blog posts to work correctly. I basically modified an example whitelist by adding a couple things:

Upmark is a Ruby gem for converting HTML to Markdown.

(NOTE: I’ve manually MODIFIED the version of Upmark ruby gem I’m running (specifically the markdown.rb script) by commenting out the portion that handles <br> tags. Leaving the break tags in, as opposed to replacing them with newlines, keeps the formatting correct inside of blockquotes)

This last bit saves our file with a Markdown extension and closes the loop we opened way up top:
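A sketch of the saving step, assuming the output goes next to the input with .md swapped in for .html (the extension choice, method names, and sample content are assumptions for illustration):

```ruby
require "tmpdir"

# Swap a trailing .html extension for .md.
def markdown_path_for(html_path)
  html_path.sub(/\.html\z/i, ".md")
end

# Write the converted text to the Markdown path and return that path.
def save_markdown(html_path, markdown)
  out_path = markdown_path_for(html_path)
  File.write(out_path, markdown)
  out_path
end

# Demo in a throwaway directory so nothing real gets overwritten.
Dir.mktmpdir do |dir|
  saved = save_markdown(File.join(dir, "post-1.html"), "# One\n\nBody text")
  puts File.basename(saved)
end
```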

The Script

Here’s the whole script, for reference: