An Introduction to Brute-force and Elegance

I’m gonna dive right in with problem 1 from Project Euler. I’ll give you enough time to go make a cup of tea and make yourself comfortable. Go on, I’ll wait.

So, we meet again. Before we start, a quick disclaimer: maybe a third of the people I know, are already familiar with the programming language R. For people who are not, I will explain as I go.

To make the code easy to read, I took screen captures, to keep the formatting and colours. Since you can’t copy & paste that, you can find all my code for this post here.

Problem 1 states:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000.

1. Unhand me, you brute!

What’s the first thing you try? Well, you figure there’s only 999 numbers to check, so you might as well make a brute-force approach, and check every one. So here is the code in R, where n1 is 3, n2 is 5, and thresh(old) is 1000:

Pic2

Pretty simple, right? All you’re doing is going through the numbers 1 to 999 [ for (i in 1:(thresh-1))]. Then checking for divisibility by 3 or 5:

Pic1

If it is true (thus that ‘if’ above), we add/append that particular ‘i’ to our vector ‘multvect’. After we’ve gone through all possible values of ‘i’, we sum up all the elements in our vector. Voila, our (correct) solution is 233,168. Simples!

The big ‘but’. (I cannot lie). Checking every number for divisibility is fine for small numbers, buuuuuuuut: what about extremely large numbers? Well, let’s have a look. Time taken to run this code: For 1:1000 = 0s ; 1:10,000 = 0.09s ; 1:100,000 = Pic3

Well, that’s embarrassing. Integer overflow? Long story short, that ‘append’ function is like adding more and more people to an orgy. An orgy in your computer memory space. (That’s why it’s called RAM). Sooner or later, someone’s gonna fall off the bed/dinner-table/ski-lift. So we’re gonna need a bigger boat. Or failing that, let’s get elegant.

2. Vectorization at Tiffany’s

The problem with ‘append’ is that you don’t know ahead of time how long that vector is, so you keep adding to it. But because of how your computer works, it might hit a limit at some point. What if we knew ahead of time, how long that vector needs to be? But how would we…¬†Wonder Twin Powers Activate! Form of: Mathematical Reasoning!

We’re just looking for multiples of 3 and 5, below 1000. We know that for 5, the sequence starts 5, 10, 15, … but what’s the largest multiple of 5, below 1000? Sure, we know on intuition that it’s 995, but how do we formalize this? All we need to do is find out how many times 5 fits into 999. So 999/5 is 199.8. But we’re interested in whole numbers, so we round down. Sure enough 199 times 5 is 995. Sorted!

So we just need to preallocate space for 199 numbers in the case of multiples of 5. For 3 its the same process. 999/3 is 333, so we only need 333 numbers. We can simplify the whole process by a neat trick that R can do, called ‘vectorization’. Instead of performing an operation (e.g. multiplying it by 3) one at a time, we can set up the whole vector (1:333, highlighted) and multiply the whole vector by 3.

Pic6

Swish!

Pic5

And our solution is 266,333. Which – wait, what the actual fuck? We should get 233,168! Where did that extra 33,165 come from? Isn’t it interesting how that number is divisible by 3 and 5? How many more rhetorical questions can I ask? Let’s take a look at, say, the first 20 values of multn1 and multn2, our vectors of multiples of 3 and 5, and work out what’s going on:

Pic7

Of course! There’s gonna be numbers which occur in both those lists. Numbers divisible by 3 and 5. Obviously, those numbers are multiples of 3*5 = 15! Sooo stoopid, Herc. Let’s try that again, but only count those values once, by adding this cheeky little line just before the line which returns the answer:

Pic8

‘Unique’ just finds the unique values, ignoring how many times they appear. So unique(2,3,3,3,3,4,4,5) returns 2,3,4,5. ‘c’ means ‘concatenate’ which is fancytalk for stitch-together. Running this function, we get: 233,168!

Now let’s look at how fast it is. brute.force took 0.09s to search through 1:10,000. Holly.Golightly does 1:10,000 in 0s ; 1:100,000 = 0s ; 1:1,000,000 = 0.1s. Niiice! Kinda curious about when it slows down. Let’s keep going: 1:10,000,000 = 1.45s. Hmm. How about 100 million?

Pic9

Fuck. The ole’ girl served us well, but maybe vectors are not the way to go for absurdly large numbers. Let’s acquit ourselves by droppin’ some mad science.

3. Two men vector, one man leaves!

Alright. All we’re really doing is adding numbers up. It shouldn’t be that hard. Let’s look at the multiples of 5 (a nice round number) and see if we can find some patterns. Let’s just take the first 10 multiples, and then we can generalize to larger numbers once we’ve found a nice rule.

5+10+15+20+25+30+35+40+45+50 = 275. How do we make this expression easier? Well, every element is a multiple of 5 (duh), and we remember our trick from above, so this becomes 5*(1+2+3+4+5+6+7+8+9+10). That looks simpler. We can always multiply by 5 after. Okay. And 1+2+3…+10 has a clear pattern to it. Take a look visually:

Pic10

Well, heck! It’s a frikkin’ triangle. With two equal sides. That ain’t so daunting. A triangle’s pretty much half of a diagonally cut square sandwich. (Not sepia-toned. Fuck off). Squares and halves are easy to deal with, so let’s make a sandwich:

Pic10

Well, fuck me sideways. How delightfully common. All those numbers to add up are just half of an almost square. If you work it out manually, 1+2+3+…+10 = 55. We no longer have to do that shit! 1+2+…+n is now just (n x (n+1))/2. Let’s try it. (10*11)/2 = 110/2 = 55.

Now here comes the cheeky bit. We want the sum of (3,6,9,…,999) + the sum of (5,10,15,…,995) minus the sum of (15,30,45,…,990). Fuck code, let’s get out the pencil and paper, and show these sandwiches a thing or two, old-school stylee.

  • 3*(1:333) = 3*(333*334)/2 = 3*(111,222)/2 = 333,666/2 = 166,833
  • 5*(1:199) = 5*(199*200)/2 = 5*(39,800)/2 = 199,000/2 = 99,500
  • 15*(1:66) = 15*(66*67)/2 = 15*(4,422)/2 = 66,330/2 = 33,165
  • 166,833 + 99,500 = 266,333 (oh, hi again! how have you been?)
  • 266,333 – 33,165 = 233,168
  • BAM! And the code:

Pic11

How fast is it? Well, it doesn’t need to allocate any memory space to vectors. It doesn’t need to create vectors at any point. It just finds three numbers, and applies the formula:

Pic12

Yeah, that’s what I thought. Keep kickin’ ass and taking names, math. We salute you.

Advertisements
This entry was posted in Problem 1, R and tagged , , , . Bookmark the permalink.

4 Responses to An Introduction to Brute-force and Elegance

  1. coreykpunches says:

    Nicely done, I look forward to seeing more!

  2. Pingback: C++ for beginners, part 1 | hercles

  3. Pingback: C++ for beginners, part 2 | hercles

  4. Pingback: C++ for beginners, part 4 (a binary digression) | hercles

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s