Category: Code

Recursion / Recursive Functions

Recursion / Recursive Functions

[Image by Maurits Cornelis Escher]

A recursive function is a function which calls itself.

It sounds like a simple concept, but in practice it can be hard to get your head around.

Here’s a lovely example of a recursive function:

void ExplainRecursion () {  
    var explained = GetInput(“Do you understand recursion yet?”);  
    if (!explained) {    
        ExplainRecursion();  
    }
}

It’s particularly nice because it uses recursion to explain recursion. It calls itself – it’s a recursive function. Every time it calls itself, it asks the question “Has recursion been explained yet?” and if the answer is No it calls itself again. It keeps calling itself repeatedly until recursion is finally understood, at which point it quits. What might not be immediately obvious is that when it quits, it will still be inside the previous call, and will keep returning until it reaches the first time it was called. Like this:

ExplainRecursion()

—–>Do you understand recursion yet?

—–>No => ExplainRecursion()

———->Do you understand recursion yet?

———->No => ExplainRecursion()

—————>Do you understand recursion yet?

—————>No => ExplainRecursion()

——————–>Do you understand recursion yet?

——————–>Yes! => Exit function

—————>Exit function

———->Exit function

—–>Exit function

Exit function

It’s true that this is a slightly jokey example. It works if you see the source code, but if you only ever saw the output you wouldn’t learn much about recursion. You’d just be asked the same question repeatedly and probably get a bit angry.

Here’s another simple example:

void Factorial(int n) {
    if (n == 1) {
        return 1;
    } else {
        return n * Factorial(n-1);
    }
}

This function is calculating “n factorial”, which is written as “n!” in mathematical notation. In case you’ve forgotten or haven’t come across this concept before, it’s easiest to explain with a concrete example: 5! (“five factorial”) is a quick way of writing 5 x 4 x 3 x 2 x 1. So n! means take the number n and multiply it by all the whole positive numbers that lie between itself and zero.

The above function calls itself, which means that unless n is 1 (because 1 factorial just equals 1), it will return a result… to itself.

Don’t worry if you’re feeling lost already. It took me a long time to be able to look at recursive functions without getting an instant headache. I found it helpful to draw little diagrams. For instance, if we use the above function to calculate 5 factorial.

Remember, the answer we expect is 5 x 4 x 3 x 2 x 1.

The first time we enter the function, we call Factorial(5), so n = 5. The function returns 5 x Factorial(4), so it gets called again and this time n = 4. That returns 4 x Factorial(3), which returns 3 x Factorial(2), which returns 2 x Factorial(1). Factorial(1) just returns 1. Here’s a diagram to help:

Factorial(5) = 5 x Factorial(4)

—–>Factorial(4) = 4 x Factorial(3)

———->Factorial(3) = 3 x Factorial(2)

—————>Factorial(2) = 2 x Factorial(1)

——————–>Factorial(1) = 1

—————>= 2 x 1

———->= 3 x (2 x 1)

—–>= 4 x (3 x (2 x 1))

= 5 x (4 x (3 x (2 x 1)))

Normally when a function returns a value, it’s over. We can move on. But with a recursive function, we might just land back inside ourselves again, and then we might return a value to yet another version of ourselves. With complex functions, this can be hard to trace through in your head without losing track. It’s also horribly easy to get stuck in an infinite loop.

Hopefully this helped. If not, don’t worry. It’s not just you! It’s a notoriously head-twisty concept.

 

Liskov Substitution Principle

Some notes I made on the Liskov Substitution Principle (LSP) – which is the L in SOLID.

I can never remember the details of Liskov, which is why I made these notes.

These notes are also visible on my new personal wiki site, where I briefly describe the S, O, I and D in SOLID as well.

  • “objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program.” See also design by contract.
  • Code that uses a base class must be able to substitute a subclass without knowing it
  • The specific functionality of the subclass may be different but must conform to the expected behaviour of the base class.
  • EXAMPLE:
    • So, I can call FemaleMammal.GiveBirth and I can substitute a cow into that – Cow.GiveBirth without breaking anything
    • No matter which version I choose, I should be confident that I will end up with a baby animal. Even though the specifics might be slightly different.
  • These are the principles which must be adhered to:
    • Contravariance with parameters
      • There should be contravariance between parameters of the base class’s methods and the matching parameters in subclasses. This means that the parameters in subclasses must either be the same types as those in the base class or must be less restrictive.
      • Parameters in subclasses are either the same or have more / add extra functionality
      • EXAMPLE:
        • FemaleMammal.GiveBirth(Mammal mothersBestFriend)
        • Cow.GiveBirth(Animal mothersBestFriend)
      • See Covariance and Contravariance
    • Covariance with return values
      • There must be covariance between method return values in the base class and its subclasses. This specifies that the subclass’ return types must be the same as, or more restrictive than, the base class’ return types.
      • Return values in subclasses are either the same or have less functionality
      • EXAMPLE:
        • Mammal baby = FemaleMammal.GiveBirth();
        • Bovine baby = Cow.GiveBirth();
      • See Covariance and Contravariance
    • Preconditions
      • The preconditions of a base class must not be strengthened by a subclass
      • A precondition of a class is a rule that must be in place before an action can be taken.
      • EXAMPLE:
        • FemaleMammal must not be a virgin and must never have had a Caesarian before giving birth
        • Cow must not be a virgin before giving birth
    • Postconditions
      • Postconditions cannot be weakened in subclasses.
      • Postconditions describe the state of objects after a process is completed.
      • EXAMPLE:
        • After female mammal gives birth, it must have an associated baby that is a mammal
        • After cow gives birth, it must have an associated baby that is a bovine (this is fine)
        • After cow gives birth, it must have an associated baby that is any kind of animal (this would not be fine)
    • Invariants
      • The invariants of a base class must not be changed by a subclass.
      • An invariant describes a condition of a process that is true before the process begins and remains true afterwards.
      • EXAMPLE:
        • FemaleMammal is female both before and after birth
        • Cow is female both before and after birth
    • History
      • New or modified members should not modify the state of an object in a manner that would not be permitted by the base class
      • EXAMPLE:
        • FemaleMammal gains a newly populated Baby property and the baby has a head
        • Cow gains a newly populated Baby property (this is fine)
        • Cow gains a newly populated CowUdders property (this is not fine)
    • Exceptions
      • A subclass should not throw exceptions that are not thrown by the base class unless they are subtypes of exceptions that may be thrown by the base class
      • EXAMPLE:
        • FemaleMammal.GiveBirth throws a BreachedBaby exception
        • Cow.GiveBirth throws a BreachedBabyCow exception (this is fine)
        • Cow.GiveBirth throws a NotEnoughGrass exception (this is not fine)
  • For instance, if you want a read-only file type, it should not inherit from File – and File should not have a Save method. Rather, the base File type has a Load method, and it has a subclass – WriteableFile – which has a Save method. Thus the base class cannot be saved. (otherwise you would have ReadOnlyFile which was not able to implement the Save method on its parent, and also left the object in a different state after calling Save than what you would expect – particularly if its Save method threw an exception which was not thrown by the base class)
Yak shaving

Yak shaving

There are some words and phrases whose precise meanings just won’t stick in my head. “Yak shaving” is one of them.

So I want to thank the author of this blog post, Yevgeniy Brikman, first of all for writing a great post about infrastructure, but also for providing a great visual aid to remind me what yak shaving is.

And I’m putting it here so that I can look it up, the next time I forget.

Here is a great visual example of yak shaving.

(Apologies, I would have embedded the giphy but I’m still using the web version of WordPress, which doesn’t seem to allow it. I will hopefully sort this out soonish).

Simultaneous equations and algorithms: Finding the “best fit” of cells in a grid

Simultaneous equations and algorithms: Finding the “best fit” of cells in a grid

I got pinged by my colleague @j_f_green this morning with the following message: “Hey Clare. Hope all is well. Was wondering if I could be cheeky and get your input on a (small) thing I’m trying to solve? I am searching for an elegant solution and I’m starting to suspect the answer may involve maths.”

…and of course I was instantly interested.

The problem

It turned out to be a relatively simple problem, depending on your point of view. If I say “GCSE-level simultaneous equations” your reaction will tell you whether that’s simple for you or not (for people not in theUK, GCSEs are the exams our pupils take when they are 16 years old).

Here is the problem: We want to discover two whole numbers (aka positive integers) that will give us the height and width of a grid in cells, given the following constraints:

  1. We have a number, let’s call it capacity, which represents the minimum number of cells to fill the grid. We want at least that many cells. We’re happy if we have more, but not if we have fewer.
  2. We also have another number, we’ll call it aspectRatio, which is the aspect ratio of the grid. This is the relationship between height and width. So for instance, if the aspect ratio is 2:1, that means the width is twice the height (it’s twice as wide as it is high). To reduce that to a single number,  aspectRatio is 2.

width = 2 * height = aspectRatio * height

(aspectRatio will often be a decimal or a fraction, so for instance if width is half height, aspectRatio is 0.5, or if the ratio is 2:3 then aspectRatio = 2/3 = 0.66, and width = 0.66 * height)

A concrete example

Here is a concrete example: Let’s say we have a grid that is twice as wide as it is high, so aspectRatio = 2. Let’s also say that our capacity is 200: We want to fit at least 200 cells into this space. I’ve deliberately chosen an easy example, and the answer in this case would be a width of 20 and a height of 10:

◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️
◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️
◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️
◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️
◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️
◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️
◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️
◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️
◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️
◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️◼️

Apologies if that’s making your eyes hurt. Is it making your eyes hurt? It’s making my eyes hurt. I’ll try and make a better diagram later. It’s not even got the correct aspect ratio on the page…

If you enjoy attacking maths problems, look away now. Have a go at solving it before you read on. You may come up with a better solution anyway.

So how do we compute the answer?

One way would be a loop. Here is some code that would do the job:

var capacity = 1000;
var aspectRatio = 1000 / 800 = 1.25;

int width = 0;
int height = 0;

while (width * height < capacity)
{
height += 1;
width = RoundToNearestInteger(height * aspectRatio);
}

Assert.AreEqual(height, 29);
Assert.AreEqual(width, 36);

Here we have a grid that has at least 1000 cells:
36 * 29 = 1044

It has roughly the correct aspect ratio:
36 / 29 = 1.24
1000 / 800 = 1.25

The problem with this is that it does not scale well. The higher the numbers, the longer the loop. But fear not – maths to the rescue.

A simple use case

Let’s start with a simple use case: imagine the aspect ratio is 2:1, and the width is twice the height. Also let’s say we’re trying to fit 200 pixels into the space.
Let’s call width and height x and y. So:
x * y = 200
and
x = 2 * y = 2y
If x = 2y then x * y is the same as 2y * y, which is 2y²:
2y² = 200
y² = 100
y = 10
x = 20 (because it’s 2y)

A generalised solution

The first block above gives us our two simultaneous equations.

We can generalise the same approach for all numbers. Let’s call the capacity a, and the ratio b. Using our equations from above :
x * y = a
and
x = b * y
These are our simultaneous equations. Using the same logic as above, we substitute y for x and end up with:
by² = a
Which means that
y² = a/b
and
y = sqrt(a/b)
(where “sqrt” is square root).
Then once you’ve calculated y, you can calculate x:
x = b * y
The only missing piece of the puzzle is that we are dealing with inequality, not equality. So really it’s this:
y >= sqrt(a/b)
This just means that once you get a result, you round it up to the nearest integer. Same applies to the final substitution : if b * y is not a whole number, then round it up.
When you calculate b * y, you should use y BEFORE rounding.
So to finish, let’s stop being mathsy and start being softwarey. Let’s give our variables some meaningful names:
height >= sqrt(capacity / aspectRatio)
width = aspectRatio * height
If we test it with the example we used in the loop above, where aspectRatio = 1.25 and capacity = 1000, we get the same results:
height >= 28.28 [= sqrt(1000 / 1.25)]
width = 36 [= RoundUp(1.25 * height)]
So height = 29, width = 36, aspectRatio = 36 / 29 = 1.24, num cells = 29 * 36 = 1044
Now we can finish with some code:
var heightUnrounded = SquareRoot(capacity / aspectRatio);
var height = RoundUp(heightUnrounded);
var width = RoundUp(heightUnrounded * aspectRatio);
Compared to the loop we started with, this scales brilliantly. It doesn’t matter how big the numbers are, the compute time is basically the same. It has a constant computational complexity, otherwise known as O(1)*. The solution above using a loop gets slower and slower depending on how big the numbers are. It has a computational complexity of O(n), so it’s not as good.
Hopefully that makes sense and I didn’t over-simplify. All feedback welcome.
* I couldn’t remember how to express O(1) so I initially guessed at O(k), because I was thinking of k as a constant. Sadly I was wrong, because I rather like the thumbs-up “Ok” concept.
Martin Fowler webinar, new Refactoring book

Martin Fowler webinar, new Refactoring book

These may well be the notiest notes I’ve ever published, but just in case they’re of any use to anyone… if nothing else they may whet your appetite for the new edition of Martin Fowler’s Refactoring book.

I confess I never read it first time round (not on purpose, just there are so many books in the world and so little time…), so I’m looking forward to reading it this time. It hasn’t come out yet in the UK but should be some time in the next few weeks. Mine is already on order [drums fingers impatiently].

So anyway, Martin did a webinar today on the topic of his new book, and here are my notes:

Refactoring should be done via small changes

  • v small semantics-preserving changes
  • So small they’re not worth doing on their own
  • String together small changes to make a big change

When adding new functionality:

  • Alternate between refactoring and adding functionality
  • Often it’s easier to add new functionality if you refactor first
  • Balance between adding new functionality and refactoring – it’s a matter of judgement

V1 of Martin’s Refactoring book vs v2

  • Some things that were included in the first book he decided not to include I the second
    • Eg Unidirectional vs bidirectional – not important
  • Some things he thought were too trivial for the first book, he has now changed his mind about and included
    • Eg moving statements within a function
  • Most notable changes = no longer centrally object-oriented
    • “Extract function” used throughout instead of extract method
    • One of the refactorings is around the choice between OO and non-OO
    • The most visible thing here is the choice of language (javascript)
    • But language is not necessarily relevant anyway
  • OO vs functional
    • He doesn’t see that as a huge shift
    • OO should still have functional elements – referentially transparent
    • They are overlapping paradigms, not distinct

Favourite refactorings?

  • “Split phase”
    • Hunk of computation which can sensibly be divided into two phases with a data structure communicating between them
    • Eg parsing – separate the tokenising out – deal with a series of tokens instead of a stream of text
    • But if you split a loop, what if you are introducing performance problems? See below…

Performance and refactoring

  • Split loop is something people often worry about because it will run the loop through twice
  • Most of the time it doesn’t matter – a push for clarity will not change the performance of the code
  • Most of the time by refactoring you open up an opportunity for performance improvements you would never have noticed otherwise
  • Then again you should run performance tests frequently
  • Not necessarily with every build, but every day or two
  • But most micro-changes have no impact on performance
  • You will only find real performance impact by performance testing

Recommendations for other books?

Refactoring architecture:

  • Always have to ask, how do you make the smallest possible change?
  • Eg extracting a data-intensive microservice from a larger system
  • Most of the moves are happening within the monolith BEFORE you start firing up the external system
  • Then you move stuff in really small pieces
  • Things that change for similar reasons should be together – you need to see the relationship between them

Branching / trunk-based development

  • Develop on a branch and merge at the end, with branch being long-lived – days, weeks or months
    • Still a really common pattern
  • But if something has been refactored, you don’t find out until you merge – this can blow up on you
    • It means you become scared to refactor! There’s a disincentive.
  • Continuous integration: Integrate with each other fully at regular intervals – at least once a day
  • Every day you receive everyone else’s code changes and push your own changes back
  • Removes the big painful merge problem – not saving up for a big horrible crunch
  • Adage of Apollo – it hurts if you do it more often (?)
    • Sorry, I wrote this down but I’ve since Googled it and I don’t really know what it’s about – I may have misheard!
  • Open source development:
    • Actually this CAN be a good place to use branches, because everyone working separately and you can’t guarantee all changes should be merged
  • People call it trunk-based development but “continuous integration” is a better name because that’s what it’s really about
  • People worry about continuous integration but it’s less painful than they think
    • If you do integration every day it becomes easier and less painful until you stop even noticing

Responsibility

  • Keep code base in healthy state
  • If not you are stealing from colleagues, clients and employers!
  • But no point saying you should do it for moral reasons, that won’t motivate people to do it
  • It’s economic: Want to get more changes out more quickly
  • It’s professional: We’re being paid to produce features rapidly
  • It’s not up to managers or business: They don’t know how to keep code healthy – that’s what they pay us to do

Refactoring and design patterns – are they incompatible?

  • With a lot of patterns, you shouldn’t introduce them right away cos they’re not appropriate and you don’t require the full strength of them yet, particularly the more complicated ones
  • Introduce patterns as a result of refactoring: Refactor towards them
  • This was recognised by the original Design Patterns authors (Gamma, Helm, Johnson, Vlissides)
  • Also Josh Kerievsky: Refactoring to Patterns

Code smells

  • Code smell of the week!
    • “This week we will look at long methods” – whenever we find them we try to refactor them

When not to refactor!

  • Code that is never touched
    • Sitting behind an API, working quite happily, doesn’t need to change – doesn’t matter if it’s a mess!
  • Don’t clean code all in one go – just a little bit more every time you go in
    • Over time that will concentrate your efforts where they are most needed – ie areas modified most frequently

Refactor as a step in the red-green-refactor cycle

  • You should only refactor when tests are green
  • Sometimes that means you have to remove a failing test, refactor, then put the failing test back in again
  • Ralph Johnson, John Brant, Don Roberts wrote the first Smalltalk refactoring book (A refactoring tool for smalltalk?)

If you don’t have tests you can’t refactor?

  • “Probably safe” refactorings? Interesting route to go? Not a great thing for most people to focus on, requires a lot of expertise
  • For most people you should have tests as well – after all, how do you know whether you’re breaking what’s currently there?

Final takeaway:

  • Small steps, commit frequently.
  • Then you can always roll back.
Getting Started with .Net Core, ASP and WebAPI

Getting Started with .Net Core, ASP and WebAPI

I followed this tutorial, which is very good: https://docs.microsoft.com/en-us/aspnet/core/tutorials/web-api-vsc?view=aspnetcore-2.1

There were some gotchas though, so (as always) I made notes:

  • Whenever you add new code you have to stop running and rebuild before you can send successful requests.
  • When you do a POST, the port number is 5001 or 5000: It’s whatever you’re using in the browser for GET requests
  • When you add the Update method, you have to send another POST to create a new Todo item before you can update it. The url is the location field from the response of the POST.
  • When you add the Delete method, use the GET request in the browser to find the Ids of existing items so you know what you can delete. The url is the same as the Update url, with appropriate Id added on. Use the GET request to check whether deletes have worked. Use the POST to add elements you can then delete.
  • When you add a new database context you need to update the dependency injection stuff in Startup.cs and you need to make sure your POCO class has an Id property.

See also the following two posts:

Converting a .Net Standard console app into a .Net Core console app

Converting a .Net Standard console app into a .Net Core console app

If you have a simple .Net console app and you’d like to convert it into a .Net Core app so that you can run it from the command line on any platform (as long as you install .Net Core), here’s what to do:

  • Remove the following files and folders from your project:
    • Bin folder
    • Obj folder
    • Packages folder
    • Properties / Assemblyinfo.cs
    • config
  • Create a new temp folder and run this command in that folder: dotnet new console
    • Take the csproj file created by that command and use it to replace the csproj file in your original project (using the name of the original csproj file).
    • !! If your original csproj file was in a nested folder, the new csproj file will have to be in the root.
      • But your program.cs file can stay where it was.
      • Or you can create a *.sln folder at the root level and add references to any nested csproj files.
        • To create a sln file: dotnet new sln
        • To add a csproj file (in a nested tests folder) to your sln file: dotnet sln add .\tests\tests.csproj
    • Command line: dotnet restore
    • Now you can build and run the new project from the command line: dotnet run
  • Check the old project for packages to include:
    • Look in packages.config
    • For instance, if you see a line that looks like this:
      • <package id=”NUnit” version=”3.11.0″ targetFramework=”net452″ />
    • …you need to run this on the command line: dotnet add package nunit -v 3.11.0
    • !!! If NUnit is one of the packages, you need to do some extra actions – see Adding NUnit tests to a .Net Core console app.
  • Gotchas:
    • You can’t build and run a console app in Visual Studio if the code is in Google Drive. You’ll get a very generic uninformative error – “Unable to start program … refresh the process list”.
    • If you try to add the NUnit package manually, you may get errors. Follow the actions in Adding NUnit tests to a .Net Core console app to fix.
      • Otherwise you may get the following errors:
      • 1) “Unable to find tests” Fix this with the following command (check version – this was Oct 2018): dotnet add package Microsoft.NET.Test.Sdk -v 15.7.2
      • 2) “Program has more than one entry point defined”. This is fixed by adding the following sub-element to a <PropertyGroup> element in your csproj file:
      • 3) “No test is available”. Fix this by adding the NUnit3TestAdapter package (at time of writing – Oct 2018 – this was version 3.10.0).

See also the following post: