The InvitAItion

Eleven Strategy - DSBA Hackathon

How can we generate the best guest list for a luxe event to increase revenues?

This was the hackathon objective presented to half of my MSc class by Eleven Strategy, a leading consulting firm based in Paris. After a random group allocation, we had 5 days to present the most convincing data-driven consulting proposal.

Context and Data provided

We were given around 55K transactions belonging to more than 13K different customers. Moreover, we had access to 10K event invitations, some of which were successful while others were not.

≈ 55K Transactions

≈ 13K Clients

10K Invitations

Our solution was tailored to a potential luxury business that wants to optimise the guests invited to luxury events. Optimal guests are those who are:

more likely to purchase a luxury item after having attended an event (classification model)
expected to spend the largest amount (regression model)

Current figures

After an initial data exploration, we noticed that only about 50% of the invitations sent were successful (the client attended the event). Therefore, the client's current invitation strategy is no better than flipping a coin. Moreover, we saw that most of the clients (93%) who accepted an invitation attended a single event. To have an unbiased estimate of the current average event gain we:

excluded the clients who attended multiple events
set a fixed window of 90 days before and after a given event to estimate the change in sales (the gain)
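As a rough illustration of the gain computation described above, here is a minimal Python sketch for a single client. The function name event_gain, the sample purchases and the choice to leave purchases made on the event day out of both windows are assumptions made for this example, not details of our hackathon code. It also assumes that clients who attended multiple events have already been filtered out.

```python
from datetime import date, timedelta

WINDOW_DAYS = 90  # fixed window length mentioned in the text


def event_gain(transactions, event_date, window_days=WINDOW_DAYS):
    """Sales in the window after the event minus sales in the window before it.

    transactions: iterable of (purchase_date, amount) pairs for ONE client.
    Purchases made on the event day itself are ignored (an assumption).
    """
    window = timedelta(days=window_days)
    before = sum(amount for day, amount in transactions
                 if event_date - window <= day < event_date)
    after = sum(amount for day, amount in transactions
                if event_date < day <= event_date + window)
    return after - before


# Hypothetical purchases for one client (illustrative values only).
purchases = [
    (date(2023, 1, 10), 500.0),   # older than 90 days before the event: ignored
    (date(2023, 4, 20), 1200.0),  # inside the "before" window
    (date(2023, 6, 15), 3000.0),  # inside the "after" window
    (date(2023, 7, 30), 800.0),   # inside the "after" window
]
event_day = date(2023, 6, 1)
gain = event_gain(purchases, event_day)
print(gain)
```

Averaging this value over all attendees of an event gives the mean event gain for that event.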

≈ 50% Invitation Success

≈ 28K Mean Event Gain ($)

Our solution KPIs

Our solution will be evaluated on the following 3 KPIs:

Increase in average event gain → Business KPI
Increase in invitation success rate → Business KPI
Model performance/accuracy → Technical KPI

In particular, regarding model performance we have focused on recall for the classification model and MSE for the regression model.
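For readers less familiar with the two technical metrics, the sketch below computes them by hand on made-up predictions. The function names and sample values are illustrative assumptions. Recall is the share of actual attendees that the classifier identified, and MSE is the average squared difference between predicted and actual spend.

```python
def recall(y_true, y_pred, positive=1):
    """Share of truly positive cases that were predicted as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def mean_squared_error(y_true, y_pred):
    """Average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical classification output: 1 = attended, 0 = did not attend.
attended = [1, 1, 1, 0, 0]
predicted_attendance = [1, 1, 0, 0, 1]
attendance_recall = recall(attended, predicted_attendance)

# Hypothetical regression output: spend after the event.
actual_spend = [100.0, 200.0, 300.0]
predicted_spend = [110.0, 190.0, 330.0]
spend_mse = mean_squared_error(actual_spend, predicted_spend)

print(attendance_recall, spend_mse)
```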

Proposed methodology to assess the true impact of an event

To assess the true impact of an event, we first identified customers with similar purchasing behaviour before they attended an event. To do so, we used a simple K-means unsupervised clustering algorithm. This step is crucial to obtain a baseline. Within each cluster, we then compared the average change in purchasing behaviour of the clients who attended a given event with that of the clients who did not. The difference between these two changes gave us the true event gain, as shown by the diagram below:

[Diagram: after clustering, for each cluster and each event, AVG ATTENDEES GAIN (AVG sales after the event minus AVG sales before the event, for clients who attended) minus AVG BASE GAIN (the same difference for clients who did not attend) gives the TRUE EVENT GAIN.]
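The comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (cluster, attended, before, after), the function name true_event_gain and the sample figures are all hypothetical, and clusters with no attendees or no non-attendees are simply skipped.

```python
from collections import defaultdict
from statistics import mean


def true_event_gain(clients):
    """Per-cluster true event gain: AVG attendees gain minus AVG base gain.

    clients: list of dicts with keys "cluster", "attended" (bool),
    "before" and "after" (sales in the windows around the event).
    """
    deltas = defaultdict(lambda: {True: [], False: []})
    for c in clients:
        deltas[c["cluster"]][c["attended"]].append(c["after"] - c["before"])

    gains = {}
    for cluster, groups in deltas.items():
        if not groups[True] or not groups[False]:
            continue  # no comparison group available for this cluster
        attendees_gain = mean(groups[True])
        base_gain = mean(groups[False])
        gains[cluster] = attendees_gain - base_gain
    return gains


# Hypothetical sales around one event (illustrative values only).
clients = [
    {"cluster": 1, "attended": True,  "before": 100.0, "after": 180.0},
    {"cluster": 1, "attended": True,  "before": 120.0, "after": 180.0},
    {"cluster": 1, "attended": False, "before": 100.0, "after": 120.0},
    {"cluster": 1, "attended": False, "before": 110.0, "after": 140.0},
    {"cluster": 2, "attended": True,  "before": 50.0,  "after": 60.0},
    {"cluster": 2, "attended": False, "before": 40.0,  "after": 50.0},
]
gains = true_event_gain(clients)
total_true_gain = sum(gains.values())
print(gains, total_true_gain)
```

In this toy data, cluster 1 shows a positive gain attributable to the event, while in cluster 2 attendees and non-attendees moved by the same amount, so the event itself contributed nothing.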

Clustering

Following our clustering analysis, we identified 6 clusters:

Cluster                      # Clients     Sales (€ M)
Loyal Shoppers               297 (2%)      31 (30%)
Women's Style Enthusiasts    8719 (67%)    58 (56%)
Men's Fashion Explorers      2364 (18%)    7 (8%)
Rare Shoppers                1286 (10%)    3 (4%)
Tiny Trends                  257 (2%)      0.5 (0.5%)
Timepieces Lovers            163 (1%)      2 (1%)

[Table rows "# Purchases" and "Customer life (Month)": per-cluster values not recovered.]

To cluster the existing clients into different groups, we mainly focused on their purchasing behaviour, using indicators such as:

purchasing amount
number of purchases
categories bought
customer lifetime (months)

Such variables are very strong indicators of future consumer behaviour, and should therefore be prioritised. Demographic variables, on the other hand, can be used to reach and find customers.
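To make the clustering step more concrete, here is a small, self-contained K-means sketch written with the Python standard library only. In practice one would more likely rely on a library implementation such as scikit-learn's KMeans. The feature choice, the standardisation step, the deterministic farthest-first initialisation and the six sample clients are assumptions made for this example, not a description of our actual pipeline.

```python
import math


def standardise(rows):
    """Scale every feature to zero mean and unit variance (assumed preprocessing)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm with a deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from its nearest existing centroid.
        centroids.append(max(points,
                             key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its closest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical clients: (purchasing amount, number of purchases, lifetime in months).
clients = [
    (50000, 40, 60), (800, 2, 5), (45000, 35, 55),
    (1200, 3, 8), (55000, 45, 65), (1000, 1, 6),
]
labels = kmeans(standardise(clients), k=2)
print(labels)
```

Standardising first matters because K-means relies on Euclidean distances: without it, the purchasing amount would dominate the other features purely because of its scale.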

Successful event

How can we understand and visualise the true impact on sales of a given event? Let's follow the steps explained in the proposed methodology to visually assess the monetary average gain of a given event.

[Figure: AVG sales ($) over time for Cluster 1 and Cluster 2, with the event marked. For each cluster, one line shows the AVG sales of clients who attended the event and another the AVG sales of clients who did not. Highlighted areas: gain due to event and gain due to external factors.]

As we can see from the illustration above, the average purchasing amount changed in both clusters after the event took place. For example, if we focus just on cluster 1, both subgroups (attendees and non-attendees) increased their spending. However, the increase is larger for those who attended the event than for those who did not. As both subgroups belong to the same cluster, we can assume that they had similar purchasing behaviours before the event took place. Therefore, the difference between the attendees' gain and the base gain can be attributed to the event itself, and is highlighted by the blue rectangle. If we repeat the same procedure for all clusters and sum up the corresponding event gains, we obtain the total true event gain. We can categorise the event above as successful because there is a gain due to the event.

Unsuccessful event

In contrast, the event represented below is not successful, as both subgroups of customers belonging to the same cluster experienced the same increase in spending. Such an increase can be attributed to external factors such as improved economic conditions or seasonal events (e.g., Christmas).

[Figure: AVG sales ($) over time for Cluster 1 and Cluster 2, with the event marked. For each cluster, one line shows the AVG sales of clients who attended the event and another the AVG sales of clients who did not. Highlighted areas: gain due to external factors only; the gain due to event is not present.]

Scoring Methodology

Now that we have understood how to classify an event as successful, let's focus on finding the optimal guests to invite. To quantify a guest's "optimality" we will use a simple scoring metric, obtained by multiplying the following two figures:

client's expected purchasing amount after attending the event (regression model)
client's likelihood of attending an event after receiving an invitation (classification model)

SCORE (monetary value) = Expected sales of customer after attending an event (monetary value) × Likelihood of attending an event after being invited (probability)

Customers are then ranked by score in descending order.
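A minimal sketch of the scoring step is shown below. It assumes that the classification and regression models have already produced, for each client, a probability of attending and an expected purchasing amount. The client IDs, the numbers and the function name rank_clients are hypothetical.

```python
def rank_clients(predictions):
    """Score = probability of attending x expected purchasing amount.

    predictions: dict mapping client id -> (probability, expected_amount).
    Returns (client id, score) pairs sorted by score in descending order.
    """
    scored = [(client_id, probability * amount)
              for client_id, (probability, amount) in predictions.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical model outputs (illustrative values only).
predictions = {
    "A": (0.9, 1000.0),  # very likely to attend, modest expected spend
    "B": (0.5, 4000.0),  # coin flip on attendance, high expected spend
    "C": (0.2, 6000.0),  # unlikely to attend, very high expected spend
}
ranking = rank_clients(predictions)
ranked_ids = [client_id for client_id, _ in ranking]
print(ranking)
```

The example shows why both models are needed: client A is the most likely to attend and client C the biggest potential spender, yet client B ends up with the highest score.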

Automated platform

We can now focus on the product we would implement if the client were to choose us as its consultants. We prototyped a fully working platform that takes the characteristics of a new event as inputs and outputs the optimal guest list for that event. Among the input variables, the client could choose:

type of event
product to promote
invitation channel
number of invitees
cost per attending invitee
other product-related variables

The optimal guest list given as output includes the clients with the highest scores. The total gain is computed as the sum of the individual scores minus the total expected cost of organising the event (number of invitees × cost per invitee).
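The selection logic can be sketched as follows. This is a simplified illustration rather than the prototype itself: it takes precomputed scores as input and only uses two of the platform inputs (number of invitees and cost per invitee). The function name build_guest_list and the sample values are assumptions.

```python
def build_guest_list(scores, n_invitees, cost_per_invitee):
    """Pick the n highest-scoring clients and compute the expected total gain.

    scores: dict mapping client id -> score (expected monetary value).
    Total gain = sum of selected scores - number of invitees x cost per invitee.
    """
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    selected = ranked[:n_invitees]
    total_gain = sum(score for _, score in selected) - len(selected) * cost_per_invitee
    return [client_id for client_id, _ in selected], total_gain


# Hypothetical scores and event settings (illustrative values only).
scores = {"A": 900.0, "B": 2000.0, "C": 1200.0, "D": 300.0}
guest_list, total_gain = build_guest_list(scores, n_invitees=2, cost_per_invitee=250.0)
print(guest_list, total_gain)
```

With these numbers the two best clients are B and C, and the expected total gain is 3200 minus 500 in costs.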

Additional Services offered

Being a data-driven consulting company, we also had to come up with additional services to sell to our client. Therefore, we thought of providing:

weekly training for an easy onboarding of the solution
maintenance of the model and dashboard
improved data collection process
continuous model retraining with new data

[Icons: Training, Maintenance, Data Collection, Continuous Learning]

Our Results

Compared to the initial situation, we were able to substantially improve the key metrics. While the initial invitation success rate was around 50%, our classification model reached 89% accuracy, with a recall of 91% on successful invitations (class 1). Regarding the average expected gain per event, our model predicted around €49K, a significant improvement over the initial €29K.

≈ 89% Invitation Success

≈ 49K Mean Event Gain ($)

However, it is important to note that the aim of the competition was not to provide a perfect model or solution, as we only had 5 days to work on the project. The goal of the hackathon was to craft the most convincing data-consulting proposal, the one a potential client would choose. In real life, providing a solid solution for such a project would require months of work and thorough analysis.

Skills acquired/improved and tools used in the project

Taking part in this hackathon was very demanding and rewarding at the same time. The random group allocation was another challenging factor, meant to replicate a real work environment even more closely. I was lucky enough to work with equally driven and qualified people, without whom it would have been much harder to win.

SKlearn

Pandas

Clustering

Client scoring

Team leading

Team building

I took the role of project leader, with the aim of motivating everyone, keeping us working together towards our common goal, and presenting our final solution. I believe that a good leader is someone able to:

listen to every team member
identify each member's strengths
have a deep technical understanding
be humble enough to trust and delegate
be a strong believer and motivator

Every member of the team needs an environment where they feel valued, listened to, and motivated.

This is why I truly believe that great leaders are great environment creators.