Home > Articles > Planning a trip to Europe like an engineer

In [1]:
import collections
from itertools import combinations, permutations

import pandas as pd
import numpy as np
from sklearn.decomposition import PCA

from geopy.geocoders import Nominatim
from geopy.distance import vincenty

%matplotlib inline
import matplotlib.pyplot as plt

Planning a Europe trip like an engineer

Three of my best friends and I decided to do the stereotypical thing and backpack around Europe after we graduated college. Three of us studied computer science, and the other one studied mechanical engineering. This is the story of some of the more engineering-y things we did for planning.

Deciding where to go

Our timeframe was constricted by various individual personal and work obligations, so our first real decision was where we wanted to go. Lance made a list of the top 25 places to go in Europe, and sent out an Excel spreadsheet wherein we all ranked each of the 25 destinations in order.

In [2]:
df = pd.read_excel('Europe_Rankings.xlsx')
df.head()
Out[2]:
City Country Lance Mauricio Sam Steve
0 Rome Italy 2 1 6 7
1 Normandy (on D-Day) France 1 13 2 9
2 Paris France 4 3 8 10
3 Prague Czech Republic 10 2 4 12
4 Berlin Germany 5 4 14 11

By adding each place's rankings together and sorting by lowest overall rank, we decided on our top 8 placess.

In [3]:
names = ['Sam', 'Steve', 'Mauricio', 'Lance']

df['Total'] = sum(df[n] for n in names)
df.sort_values('Total')[:9]
Out[3]:
City Country Lance Mauricio Sam Steve Total
0 Rome Italy 2 1 6 7 16
1 Normandy (on D-Day) France 1 13 2 9 25
2 Paris France 4 3 8 10 25
3 Prague Czech Republic 10 2 4 12 28
4 Berlin Germany 5 4 14 11 34
5 Giants Causeway Ireland 20 14 1 1 36
6 Amsterdam Netherlands 7 6 9 14 36
7 Vienna Austria 14 8 12 3 37
8 Budapest Hungary 6 12 3 18 39

At this point, I was curious about how similar we were in our travel preferences. We can graph our preferences in a two-dimensional space by running our individual preferences through PCA.

In [4]:
pca_2 = PCA(2)

vectors = [df[n].tolist()[:25] for n in names]

plot_columns = pca_2.fit_transform(vectors)

plt.title('Travel preferences (2D projection)')
colors = ['red', 'yellow', 'black', 'lime']
for t in zip(colors, plot_columns.tolist(), names):
    plt.scatter(t[1][0], t[1][1], c=t[0], marker='o', label=t[2])

plt.legend(loc='lower right')
plt.show()

Planning our route

Now, if you put a pin on each of those cities and look at a map, the optimal route is fairly clear. But that didn't stop me from being curious about this particular instance of the travelling salesman problem.

In [5]:
cities = [(r['City'], r['Country']) for i, r in df.sort_values('Total').iterrows()][:9]
cities[1] = ('Omaha Beach', 'France')
cities.append(('Dublin', 'Ireland'))  # we were flying in/out of Dublin
cities
Out[5]:
[('Rome', 'Italy'),
 ('Omaha Beach', 'France'),
 ('Paris', 'France'),
 ('Prague', 'Czech Republic'),
 ('Berlin', 'Germany'),
 ('Giants Causeway', 'Ireland'),
 ('Amsterdam', 'Netherlands'),
 ('Vienna', 'Austria'),
 ('Budapest', 'Hungary'),
 ('Dublin', 'Ireland')]

First, we find the latitude and longitude of each destination.

In [6]:
gl = Nominatim()
locations = [gl.geocode(ci + ' ' + co) for ci, co in cities]
coded_cities = list(zip(cities, locations))

Then, we find the distance between each pair of cities. I'm using Vincenty's Formulae to get an as-the-crow-flies distance. Admittedly this isn't perfect, but it should be close enough.

In [7]:
weights = collections.defaultdict(lambda: collections.defaultdict(lambda: np.inf))
for city1, city2 in combinations(coded_cities, 2):
    name1, name2 = city1[0], city2[0]
    loc1 = (city1[1].latitude, city1[1].longitude)
    loc2 = (city2[1].latitude, city2[1].longitude)
    
    weights[name1][name2] = vincenty(loc1, loc2).miles
    weights[name2][name1] = vincenty(loc2, loc1).miles
In [8]:
middle_cities = cities[:-1]
DUBLIN = ('Dublin', 'Ireland')
In [9]:
min_path_weight = np.inf
min_path = None
for p in permutations(middle_cities, r=len(middle_cities)):
    path_weight = weights[DUBLIN][p[0]]
    for i in range(len(p) - 1):
        path_weight += weights[p[i]][p[i + 1]]
    path_weight += weights[p[-1]][DUBLIN]
    
    if path_weight < min_path_weight:
        min_path_weight = path_weight
        min_path = p
In [10]:
min_path_weight, [DUBLIN] + list(min_path) + [DUBLIN]
Out[10]:
(3163.196958561114,
 [('Dublin', 'Ireland'),
  ('Giants Causeway', 'Ireland'),
  ('Amsterdam', 'Netherlands'),
  ('Berlin', 'Germany'),
  ('Prague', 'Czech Republic'),
  ('Vienna', 'Austria'),
  ('Budapest', 'Hungary'),
  ('Rome', 'Italy'),
  ('Paris', 'France'),
  ('Omaha Beach', 'France'),
  ('Dublin', 'Ireland')])

Planning each location

At this point, we needed to figure out what we were going to do in each place (and where we were going to sleep). We employed a divide and conquer strategy, putting each of us in charge of two or three cities. This strategy worked out well, because we know each other well and trust each other. Each person's process was different, but by and large we used HostelWorld, TripAdvisor, and various internet searches to figure out where to stay and what to do.

We had a great, mostly stress-free time. I highly recommend planning a trip this way!