Justin Makes a Ruby Program!

I’m helping a friend convert some HTML-formatted posts from his blog into an ePub. I wanted to use the great Markdown editor Ulysses to produce the ePub, because it’s a nice program and because being able to edit in Ulysses would give me some easy control over the formatting of the ePub.

This required two steps.

The first was to get the HTML files into a format suitable for importing into Ulysses and outputting into an ePub.

This was a bit tricky, because I wouldn’t want to just do a straightforward conversion from HTML to Markdown. I needed to retain some HTML so that it could be styled by the CSS sheet I’d use in Ulysses and look correct. So I needed to be able to selectively remove formatting while getting it Markdown-ish enough for Ulysses to be able to output to ePub.

The second step would be to actually output the ePub with the desired styling. This would involve customizing a CSS Style sheet in Ulysses.

Step two was basically trivial — I modified Jennifer Mack’s excellent KBasic style sheet, essentially doing a copy-and-paste (with modification) of some of the relevant CSS stylesheet from my friend’s blog. No prob.

So this post will focus on the first step, the Ruby program and associated gems I used to get the HTML files in shape for Ulysses, since that was the interesting part.

Step by step

This bit lets us work with various Ruby Gems necessary for the project:

I wanted the script to iterate through a whole directory of HTML files. That’s what this next bit does:

The next few steps use Nokogiri, an HTML/XML parser, to pull the information we want from the HTML file we’re currently working with.

This initializes a variable that will let us work with the content of the html file in Nokogiri:

This initializes a variable for the content of the blog post we are gonna turn into an ePub:

In contrast with what I do below on the blog post’s title, I am not using the .text Nokogiri method to extract the text of the article. Why? Cuz i need the HTML formatting of the article for various purposes (to convert to Markdown and to have properly formatted blockquotes). Nokokgiri has a css selector that lets you extract elements from web pages. See how it works here

We want to put the body of the post into a string for further manipulation, so we do that next:

This initializes a variable for the title of the article. We just want the text here, so we’ll use the .text Nokogiri method:

This concatentes the title with a leading “# ” so that it will be a proper Markdown title, which will automate the process of turning blog posts into individual chapters in an ePub:

(The strip method prevents the title from appearing on a separate line than the “#”)

Next we’re gonna combine the post title and body. first we initialize an empty string:

Then we concatenate the strings we’ve got for the title and body, with the title being added first of course:

This next bit puts blockquotes on their own lines, which helps ulysses handle blockquotes correctly when making ePubs:

This converts <h3> tags to markdown appropriate format, which Upmark wasn’t handling well for some reason:

Sanitize “cleans up” html files by removing stuff that’s not on the white list. I had to build my own custom whitelist to get curi blog posts to work correctly. I basically modified an example white list by adding a couple things:

Upmark is a Ruby gem for converting HTML to Markdown.

(NOTE: I’ve manually MODIFIED the version of Upmark ruby gem I’m running (specifically the markdown.rb script) by commenting out the portion that handles <br> tags. Leaving the break tags in, as opposed to replacing them with newlines, keeps the formatting correct inside of blockquotes)

This last bit saves our file to with a markdown extension and closes the loop we opened way up top:

The Script

Here’s the whole script, for reference:

Justin Attempts to Understand Part of “The Fabric of Reality”, Part 3

Summary so far of my attempts to understand David Deutsch’s discussion of Godel’s Theorem in The Fabric of Reality:

So in Part 1 we considered a type of “diagonal argument” applied to DD’s discussion of Cantgotu enviroments. Basically, DD said that a program for a VR generator has to have a discrete, quantized set of values for any variables, expressible as a finite sequence of symbols, and that it can be executed in a sequence of steps. Suppose we have an infinite set of possible VR programs meeting this criteria that some VR generator can run. We could define a second infinite set of programs (the Cantgotu list) in relation to the first by saying that this second set consists of programs whose characteristic is that, in any given minute, the VR environment they produce behaves differently than a VR environment in the first set. So in the first minute, Cantgotu Environment 1 behaves differently than plain old Environment 1, and in the second minute, Cantogu Environment 1 behaves differently than plain old Environment 2, and so on. So ultimately the Cantgotu Environment varies from every environment in the first list, and so this shows we can only run a fraction of logically possible environments.

In Part 2 we talked about the problem situation Godel’s Theorem was facing. Basically, mathematicians liked to believe that their proofs were on absolutely certain foundations. And Godel made this impossible by showing that 1) any set of rules of inference that could validate arithmetic proofs couldn’t validate its own consistency, and 2) a consistent set of rules of inference will fail to show as valid methods of proof which are in fact valid. And DD tells us that Godel used something like the diagonal arg that Cantor used to show there’s bigger infinities than natural numbers and that DD uses when talking about Cantgotu environments.

Basically, with regards to point 2, Godel considered some set of rules for making inferences, then showed how to construct a proposition you couldn’t either prove or disprove under the rules of inference. Then he showed the proposition was in fact true.

I feel like I get the general point but am not sure I could apply explain or apply this idea very well without knowing more technical details.

BTW I watched this video on Cantor’s diagonalization argument which I thought was okay (I tried a few different vids out). I looked for additional material (besides The Fabric of Reality) directly on Godel’s Theorem but it all seemed like it would be above my level of mathematical understanding.

Justin Attempts to Understand Part of “The Fabric of Reality”, Part 2

This post is part 2 of my attempt to understand the discussion of Godel’s incompleteness theorem presented in The Fabric of Reality by David Deutsch (DD). See Part 1 here.

Below is a summary of what I thought the relevant parts of Chapter 10 were. I included a big chunk of the beginning of the chapter cuz it seemed like relevant context to understand the problem situation Godel was facing.

David looks back on his discussion of Cantgotu environments and asks:

I have said that there exist infinitely many [Cantgotu environments] for every environment that can be rendered. But what does it mean to say that such environments ‘exist’? If they do not exist in reality, or even in virtual reality, where do they exist?

DD asks whether abstract, non-physical entities like numbers & laws of physics exist. He wants to distinguish things that have an independent existence vs. things that are features of our culture, arbitrary, etc.

DD says if we want to know whether a given abstraction exists, we should ask whether it “kicks back” in a complex, autonomous way. He talks about natural numbers as an example. We came up with them as an abstract way of expressing “successive amounts of a discrete quantity.” But after we made them, it turns out they have all these properties we have to figure out (like characteristics of the distribution of prime numbers.)

Since we cannot understand them either as being part of ourselves or as being part of something else that we already understand, but we can understand them as independent entities, we must conclude that they are real, independent entities.

DD says abstract entities are intangible and don’t literally kick back. He says proofs play the role in maths experiment and observation plays in science, and says that mathematicians are big on thinking proofs are absolutely certain.

DD talks some about Pythagoras and Plato. Pythagoras thought regularities in nature were expressions of mathematical relationships in natural numbers. Plato thought the physical world wasn’t real and thought things we experience in our world were reflections from the world of Forms (the Forms including numbers like 1,2,3, mathematical operations etc.) DD also mentions Plato’s theory of in-born knowledge.

DD says that mathematicians agree “mathematical intuition” is a source of absolute certainty, but then disagree about what mathematical intuition tells them, heh. He gives the example of imaginary numbers. There were proofs about real numbers that involved imaginary numbers, and some mathematicians objected because they said imaginary numbers weren’t real. Good example.

DD talks about a crisis in mathematics. Aristotle had figured out the laws of logic and syllogisms, and said that all valid proofs could be expressed as syllogisms. He hadn’t proved this though. And DD says the crisis was modern mathematical proofs weren’t expressed syllogistically and involved tools outside the classical logic rules. So this proved that Aristotle’s rules were inadequate. And people were worried that maybe the new stuff wasn’t absolutely certain.

DD talks about responses to this crisis. One was intuitionism, which tries to construct intuition in a super narrow way only based on supposedly unchallengeable, self-evident aspects. This leads to them denying infinite sets.

DD says intuitionism had some value (like inductivism) in that it dared to question some received certainties. But it ultimately involved retreating into an inner world/domain that actually makes explanation of even that inner domain harder. E.g. intuitionists deny infinities. So there must be finite natural numbers. How many? And then why can’t you form an intuition about the next natural number above that one? Intuitionists reply to this by denying logic, specifically the law of the excluded middle, which says there’s no third possibility between a given proposition and its negation.

DD says:

…by severing the link between their version of the abstract ‘natural numbers’ and the intuitions that those numbers were originally intended to formalize, intuitionists have also denied themselves the usual explanatory structure through which natural numbers are understood. This raises a problem for anyone who prefers explanations to unexplained complications. Instead of solving that problem by providing an alternative or deeper explanatory structure for the natural numbers, intuitionism does exactly what the Inquisition did, and what solipsists do: it retreats still further from explanation. It introduces further unexplained complications (in this case the denial of the law of the excluded middle) whose only purpose is to allow intuitionists to behave as if their opponents’ explanation were true, while drawing no conclusions about reality from this.

DD then turns to Godel:

Thirty-one years later, Kurt Godel revolutionized proof theory with a root-and- branch refutation from which the mathematical and philosophical worlds are still reeling: […] Godel proved first that any set of rules of inference that is capable of correctly validating even the proofs of ordinary arithmetic could never validate a proof of its own consistency. […] Second, Godel proved that if a set of rules of inference in some (sufficiently rich) branch of mathematics is consistent (whether provably so or not), then within that branch of mathematics there must exist valid methods of proof that those rules fail to designate as valid. This is called Godel’s incompleteness theorem. To prove his theorems, Godel used a remarkable extension of the Cantor ‘diagonal argument’ that I mentioned in Chapter 6. He began by considering any consistent set of rules of inference. Then he showed how to construct a proposition which could neither be proved nor disproved under those rules. Then he proved that that proposition would be true.

So basically Godel showed that we can have no fixed way of knowing if a mathematical proposition is true.

Skipping ahead some in the chapter…

DD says one of Godel’s assumptions was that a proof can have only a finite number of steps. DD says this is correct as far as we know according to quantum theory.

DD says Godel inherited Greek conception that a proof was a particular type of object (a sequence of statements that obey rules of inference.) But really it’s better thought of as a process. He says that with the “classical theory of proof or computation” this distinction isn’t hugely significant because you can just make a record of the process and therefore have a proof “object”. But with quantum computer calculations lots of stuff is happening in other universes so you can’t record all that stuff. And this shows that the old traditional mathematical method of trying to have totally certain stuff by stripping away every source of ambiguity and error isn’t viable.

In the next part I’ll try and sum up.