import collections from itertools import combinations, permutations import pandas as pd import numpy as np from sklearn.decomposition import PCA from geopy.geocoders import Nominatim from geopy.distance import vincenty %matplotlib inline import matplotlib.pyplot as plt
Three of my best friends and I decided to do the stereotypical thing and backpack around Europe after we graduated college. Three of us studied computer science, and the other one studied mechanical engineering. This is the story of some of the more engineering-y things we did for planning.
Our timeframe was constricted by various individual personal and work obligations, so our first real decision was where we wanted to go. Lance made a list of the top 25 places to go in Europe, and sent out an Excel spreadsheet wherein we all ranked each of the 25 destinations in order.
df = pd.read_excel('Europe_Rankings.xlsx') df.head()
|1||Normandy (on D-Day)||France||1||13||2||9|
By adding each place's rankings together and sorting by lowest overall rank, we decided on our top 8 placess.
names = ['Sam', 'Steve', 'Mauricio', 'Lance'] df['Total'] = sum(df[n] for n in names) df.sort_values('Total')[:9]
|1||Normandy (on D-Day)||France||1||13||2||9||25|
At this point, I was curious about how similar we were in our travel preferences. We can graph our preferences in a two-dimensional space by running our individual preferences through PCA.
pca_2 = PCA(2) vectors = [df[n].tolist()[:25] for n in names] plot_columns = pca_2.fit_transform(vectors) plt.title('Travel preferences (2D projection)') colors = ['red', 'yellow', 'black', 'lime'] for t in zip(colors, plot_columns.tolist(), names): plt.scatter(t, t, c=t, marker='o', label=t) plt.legend(loc='lower right') plt.show()
Now, if you put a pin on each of those cities and look at a map, the optimal route is fairly clear. But that didn't stop me from being curious about this particular instance of the travelling salesman problem.
cities = [(r['City'], r['Country']) for i, r in df.sort_values('Total').iterrows()][:9] cities = ('Omaha Beach', 'France') cities.append(('Dublin', 'Ireland')) # we were flying in/out of Dublin cities
[('Rome', 'Italy'), ('Omaha Beach', 'France'), ('Paris', 'France'), ('Prague', 'Czech Republic'), ('Berlin', 'Germany'), ('Giants Causeway', 'Ireland'), ('Amsterdam', 'Netherlands'), ('Vienna', 'Austria'), ('Budapest', 'Hungary'), ('Dublin', 'Ireland')]
First, we find the latitude and longitude of each destination.
gl = Nominatim() locations = [gl.geocode(ci + ' ' + co) for ci, co in cities] coded_cities = list(zip(cities, locations))
Then, we find the distance between each pair of cities. I'm using Vincenty's Formulae to get an as-the-crow-flies distance. Admittedly this isn't perfect, but it should be close enough.
weights = collections.defaultdict(lambda: collections.defaultdict(lambda: np.inf)) for city1, city2 in combinations(coded_cities, 2): name1, name2 = city1, city2 loc1 = (city1.latitude, city1.longitude) loc2 = (city2.latitude, city2.longitude) weights[name1][name2] = vincenty(loc1, loc2).miles weights[name2][name1] = vincenty(loc2, loc1).miles
middle_cities = cities[:-1] DUBLIN = ('Dublin', 'Ireland')
min_path_weight = np.inf min_path = None for p in permutations(middle_cities, r=len(middle_cities)): path_weight = weights[DUBLIN][p] for i in range(len(p) - 1): path_weight += weights[p[i]][p[i + 1]] path_weight += weights[p[-1]][DUBLIN] if path_weight < min_path_weight: min_path_weight = path_weight min_path = p
min_path_weight, [DUBLIN] + list(min_path) + [DUBLIN]
(3163.196958561114, [('Dublin', 'Ireland'), ('Giants Causeway', 'Ireland'), ('Amsterdam', 'Netherlands'), ('Berlin', 'Germany'), ('Prague', 'Czech Republic'), ('Vienna', 'Austria'), ('Budapest', 'Hungary'), ('Rome', 'Italy'), ('Paris', 'France'), ('Omaha Beach', 'France'), ('Dublin', 'Ireland')])
At this point, we needed to figure out what we were going to do in each place (and where we were going to sleep). We employed a divide and conquer strategy, putting each of us in charge of two or three cities. This strategy worked out well, because we know each other well and trust each other. Each person's process was different, but by and large we used HostelWorld, TripAdvisor, and various internet searches to figure out where to stay and what to do.
We had a great, mostly stress-free time. I highly recommend planning a trip this way!