© Hoffman and Novak 2017 | http://postsocial.gwu.edu
Visualizing Emergent Identity of Assemblages in the
Internet of Things:
A Topological Data Analysis Approach
Paper presented at 2017 INFORMS Marketing Science Conference, Los Angeles, CA, June 10, 2017
Tom Novak and Donna Hoffman, The George Washington University
Agenda
Use Topological Data
Analysis (TDA) to
operationalize
DeLanda’s (2002,
2006, 2011, 2016)
concept of an
assemblage’s
possibility space
(a.k.a. the market
structure of
underlying consumer
needs).
The Internet of Things (IoT) and
IFTTT
Topological Data Analysis (TDA)
Analysis of IFTTT Recipes with TDA
Future Directions
The IoT and IFTTT
IFTTT Codes IoT and Web
Interactions
Internet of Things (IoT)
The wide range of everyday objects and products in the real world
that are enhanced with programmable sensors and actuators that
communicate with other devices and consumers through the
Internet. --Hoffman and Novak 2016
IFTTT.com
Anatomy of an IFTTT Recipe
If any new post on Blogger then create a link post on Facebook
Trigger
Event
Trigger
Channel
Action
Event
Action
Channel
IFTTT (If-This-Then-That) Recipe
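As a minimal sketch, the four parts of a recipe can be held in a small record. The class name `Recipe` and its field names are illustrative assumptions, not part of IFTTT or of the authors' data format; the example values come from the recipe shown on this slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    """Hypothetical container for the four parts of an IFTTT recipe."""
    trigger_event: str
    trigger_channel: str
    action_event: str
    action_channel: str

    def describe(self) -> str:
        # Rebuild the "If <trigger> then <action>" sentence from the parts.
        return (f"If {self.trigger_event} on {self.trigger_channel} "
                f"then {self.action_event} on {self.action_channel}")


recipe = Recipe(
    trigger_event="any new post",
    trigger_channel="Blogger",
    action_event="create a link post",
    action_channel="Facebook",
)
print(recipe.describe())
```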
5 Years of IFTTT
Each day, 20 million IFTTT recipes are run by IFTTT users (Lunden 2015).
Users can choose to make their recipes public by publishing them. From 2011
to 2016, a total of 331,391 IFTTT recipes have been published.
Of these published recipes, 20,675 are unique. Variants of these unique
recipes have been published anywhere from 1 to 9273 times.
The 20,675 unique published IFTTT recipes use:
347 different trigger channels using 1110 different trigger events
297 different action channels using 591 different action events
Research Question: What is the topological structure of the 20,675 published
IFTTT recipes?
An Assemblage Emerges from the Interaction of its
Components
Consumer
Alexa
Hue
IFTTT
Assemblage
Interacting Components
The Possibility
Space and
Assemblages
TDA provides an
empirical way of
visualizing the
mechanism-
independent
possibility space,
given a population
of individual
assemblages.
Two requirements for emergence of an
assemblage (DeLanda 2011):
Mechanism-dependent: ongoing
recurrent processes involve
interaction among components of an
assemblage, through their exercised
paired capacities.
Mechanism-independent: a
mathematical topological structure, the
possibility space, contains points of
attraction that guide the recurrent
processes of assemblage. This leads to
the emergence of populations of
assemblages that wind up in the same
place, although “different trajectories may
be attracted to the same final state”
(DeLanda 2002, p 7).
Topological Data
Analysis (TDA)
Topological Data Analysis (TDA)
TDA (Carlsson 2009; Lum et al. 2012; Singh, Mémoli, and Carlsson
2007) uses computational topology techniques on complex
high-dimensional data to produce a three-dimensional topology of simplicial
complexes (discrete, combinatorial objects). Groups of data are
represented as nodes, each containing rows that are similar to each other
in the high-dimensional space, and edges connect nodes
that share rows.
How TDA Creates Topological Models
[Figure: data plotted on x and y axes, with lens values sorted into four overlapping bins, Bin 1 through Bin 4]
Slide images adapted courtesy of Ayasdi, Inc. http://ayasdi.com/
Step 1
Rectangular data.
Many rows, 2
columns x and y.
When plotted,
these data define
a circle.
Step 2
Map data
onto a single
number using
a function (or
lens). Here,
the lens is
the y
coordinate.
Step 3
Sort data into
overlapping bins,
based upon the
value of the lens.
Then, look at
values of the
original variables
in these bins.
Step 4
Cluster the original
data (i.e. x and y
values) within each
bin. Bins 1 and 4
have one cluster.
Bins 2 and 3 have
two clusters. This
results in 6 nodes.
Step 5
Connect nodes by
an edge if they
have rows in
common. The
shape of the
topological model
has meaning and
represents the
shape of the data.
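The five steps can be sketched in plain Python for the circle example. This is a toy illustration under stated assumptions, not the Ayasdi implementation used for the models in this talk: the function names, the 30% bin overlap, and single-linkage clustering with a fixed distance threshold `eps` are all choices made here for illustration.

```python
import math
from itertools import combinations


def circle_points(n=200):
    # Step 1: rectangular data, many rows and two columns (x, y), on a circle.
    return [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
            for i in range(n)]


def overlapping_bins(lo, hi, n_bins, overlap):
    # Step 3: equal-length intervals covering [lo, hi]; neighbours overlap
    # by the given fraction of an interval's length.
    length = (hi - lo) / (1 + (n_bins - 1) * (1 - overlap))
    step = length * (1 - overlap)
    return [(lo + k * step, lo + k * step + length) for k in range(n_bins)]


def single_linkage(indices, points, eps):
    # Step 4: cluster rows inside one bin; rows closer than eps are merged.
    parent = {i: i for i in indices}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(indices, 2):
        if math.dist(points[a], points[b]) <= eps:
            parent[find(a)] = find(b)
    clusters = {}
    for i in indices:
        clusters.setdefault(find(i), set()).add(i)
    return list(clusters.values())


def mapper(points, lens, n_bins=4, overlap=0.3, eps=0.1):
    values = [lens(p) for p in points]           # Step 2: apply the lens.
    tol = 1e-9                                   # guard against float rounding
    nodes = []
    for lo, hi in overlapping_bins(min(values), max(values), n_bins, overlap):
        members = [i for i, v in enumerate(values) if lo - tol <= v <= hi + tol]
        nodes.extend(single_linkage(members, points, eps))
    # Step 5: connect two nodes when they have at least one row in common.
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]]
    return nodes, edges


nodes, edges = mapper(circle_points(), lens=lambda p: p[1])  # lens = y
print(len(nodes), "nodes,", len(edges), "edges")
```

With these settings the bottom and top bins each give one cluster and the two middle bins give two each, so the six nodes join into a single loop, which recovers the circular shape of the data.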
Analysis of IFTTT
Recipes with TDA
TDA Implementation
Open source
Python Mapper (Müllner and Babu 2013; Singh, Mémoli, and Carlsson 2007)
Kepler Mapper in Python (Triskelion 2015, proof of concept for Ayasdi flavor of TDA)
Dionysus (in C++ with Python bindings), based on Zomorodian and Carlsson (2005)
and Edelsbrunner, Letscher, and Zomorodian (2000) on computing persistent homology
Package TDA - R interface for GUDHI, Dionysus, PHAT
TDA Mapper R package using Mapper
JavaPlex library implements persistent homology for MATLAB and java-based systems
CTL = C++ library for computational topology
And others, see GitHub
Commercial
Ayasdi (Workbench Web platform and Python SDK)
IFTTT Recipes – Preparation for TDA
20,675 unique IFTTT published recipe text strings from 2011-2016 of the form:
IF say a specific phrase using_channel Amazon Alexa
THEN boost your hot water using_channel Hive Active Heating
Created binary variables for ngrams that had a frequency > 50, ignoring standard
stop words:
1-grams (triggers n=364 and actions n=264): “Alexa”
2-grams (triggers n=451 and actions n=353): “Amazon Alexa”
3-grams (triggers n=407 and actions n=324): “Amazon Alexa add”
PCA on 2163 ngrams (of these, 880 were non-redundant)
362 eigenvalues > 1
First 100 eigenvalues explained 51.04% of variance (first 2 explained 1.91%)
Obtained scores on first 100 components (variance = eigenvalue)
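A rough sketch of the n-gram coding step, assuming recipe strings in the `IF … using_channel … THEN … using_channel …` form shown above. The stop-word list, the `trigger:`/`action:` feature prefixes, the token order (event words followed by channel words), and the second and third example recipes are assumptions made for illustration. The toy threshold keeps n-grams found in more than one recipe, where the slides use a frequency above 50. The PCA step is not shown.

```python
import re
from collections import Counter

# Illustrative stop-word list; the authors' "standard stop words" are not given.
STOP_WORDS = {"a", "an", "the", "your", "to", "of", "on", "in"}

PATTERN = re.compile(
    r"^IF (?P<te>.+?) using_channel (?P<tc>.+?) "
    r"THEN (?P<ae>.+?) using_channel (?P<ac>.+)$"
)


def split_recipe(text):
    """Split a recipe string into its trigger-side and action-side text."""
    match = PATTERN.match(" ".join(text.split()))  # collapse line breaks
    if match is None:
        raise ValueError(f"unrecognized recipe: {text!r}")
    return {
        "trigger": f"{match['te']} {match['tc']}",
        "action": f"{match['ae']} {match['ac']}",
    }


def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text, max_n=3):
    """Set of 1-, 2-, and 3-grams, computed separately for trigger and action."""
    found = set()
    for side, phrase in split_recipe(text).items():
        tokens = [t for t in phrase.lower().split() if t not in STOP_WORDS]
        for n in range(1, max_n + 1):
            found.update(f"{side}:{gram}" for gram in ngrams(tokens, n))
    return found


def binary_matrix(recipes, min_count):
    """One row per recipe, one 0/1 column per n-gram with count > min_count."""
    per_recipe = [features(r) for r in recipes]
    counts = Counter(f for feats in per_recipe for f in feats)
    vocab = sorted(f for f, c in counts.items() if c > min_count)
    rows = [[1 if f in feats else 0 for f in vocab] for feats in per_recipe]
    return vocab, rows


recipes = [
    "IF say a specific phrase using_channel Amazon Alexa "
    "THEN boost your hot water using_channel Hive Active Heating",
    # The next two strings are made up for this example.
    "IF say a specific phrase using_channel Amazon Alexa "
    "THEN turn on lights using_channel Philips Hue",
    "IF any new post using_channel Blogger "
    "THEN create a link post using_channel Facebook",
]
vocab, rows = binary_matrix(recipes, min_count=1)
print(len(vocab), "ngram columns")
```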
Assumptions for TDA of Our IFTTT Data
Euclidean distance was used as the metric defining distance
between the rows of the data matrix.
Why used? Each row contains scores on the first 100 principal
components of ngrams, so Euclidean distance has a natural
interpretation.
Neighborhood Lenses 1 & 2 (coordinates of a k-nearest neighbors
graph of the data embedded in two-dimensions)
Why used? Magnifies differentiation among groups by locally
adapting distance/closeness.
Note: other metrics and lenses were also tried, but produced less clearly
interpretable topological models.
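To make the two ingredients concrete, here is a small sketch of Euclidean distance between rows of component scores and the k-nearest neighbors graph built from it. It does not reproduce Ayasdi's neighborhood lenses, which additionally embed that graph in two dimensions; the function name `knn_graph` and the toy scores are assumptions.

```python
import math


def knn_graph(rows, k):
    """Map each row index to the indices of its k nearest rows (Euclidean)."""
    graph = {}
    for i, row in enumerate(rows):
        distances = sorted(
            (math.dist(row, other), j)
            for j, other in enumerate(rows) if j != i
        )
        graph[i] = [j for _, j in distances[:k]]
    return graph


# Toy component scores: two well-separated groups of three rows each.
scores = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.3)]
graph = knn_graph(scores, k=2)
for row_index, neighbors in graph.items():
    print(row_index, neighbors)
```

In this toy case every row's nearest neighbors fall inside its own group, which is the local structure a neighborhood-based lens is meant to emphasize.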
Topological Model of 5 years of
Published IFTTT Recipes from 2011-2016
71% of IFTTT recipes are in nodes of one large connected component
10% of IFTTT recipes are in nodes that are singletons
19% of IFTTT recipes are in nodes of 13 small connected components
Ayasdi 6.7 Workbench and Python SDK used to generate topological models
367 nodes contain 14,692 IFTTT recipes (71% of all recipes).
Recipes can be in more than one node.
Nodes are connected by an edge if they have a recipe in common.
Topological Model of 5 years of
Published IFTTT Recipes from 2011-2016
Topological Models – Varying Resolution
(number of bins used by TDA to generate nodes)
Low Resolution Model (resolution = 25)
High Resolution Model (resolution = 60)
Topological Models – Varying Gain
(degree of overlap of IFTTT recipes within nodes)
Low Gain Model (gain = 1.4)
High Gain Model (gain = 2.8)
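A sketch of how resolution and gain could shape the bins along one lens. It assumes the convention that neighboring bins overlap by a fraction 1 - 1/gain of a bin's length; the slides do not state the exact formula used by Ayasdi, so treat both the formula and the function name `cover` as assumptions.

```python
def cover(lo, hi, resolution, gain):
    """Return `resolution` equal-length intervals covering [lo, hi].

    Assumed convention: consecutive intervals overlap by (1 - 1/gain)
    of an interval's length, so higher gain means more shared rows.
    """
    length = (hi - lo) / (1 + (resolution - 1) / gain)
    step = length / gain
    return [(lo + k * step, lo + k * step + length) for k in range(resolution)]


def overlap_fraction(intervals):
    (a_lo, a_hi), (b_lo, _b_hi) = intervals[0], intervals[1]
    return (a_hi - b_lo) / (a_hi - a_lo)


low_gain = cover(0.0, 1.0, resolution=25, gain=1.4)
high_gain = cover(0.0, 1.0, resolution=25, gain=2.8)
print(round(overlap_fraction(low_gain), 3), round(overlap_fraction(high_gain), 3))
```

Under this assumption, raising resolution produces more and narrower bins (more nodes), while raising gain places more rows in several bins at once (more edges).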
[Figure labels for groups of nodes: T = trigger channels, A = action channels]
T=DateTime
A=RescueTime, LIFX, Wemo
T=Android Device, Reddit, Android Battery
A=Android Device, Android Wear
T=Facebook Pages, Facebook, Tumblr, Twitter
A=Facebook, Dropbox, OneDrive, Box, Tumblr, WordPress, Blogger
T=Gmail
A=IF, Hue, Boxcar, Wemo
T=Alexa, Wemo, SmartThings, Ring, ScoutAlarm, D-Link Motion, Arlo
A=SmartThings, Harmony, Manything
T=Location
T=Google Calendar, Stocks, Square
A=Skype, Pushbullet
T=Fitbit, Jawbone UP, Misfit, Withings, Nike+
A=Slack, Hue, IF, Gmail, SMS
T=Pocket, WordPress, Tumblr, Blogger, eBay, Instapaper, Craigslist
A=Evernote, Delicious, Diigo, Pinboard, Tumblr, WordPress, Blogger
A=OneNote
T=Android SMS
A=Android Device
T=YouTube, Foursquare, Vimeo, Flickr, DailyMotion
A=Pocket, Instapaper, Bitly, Tumblr, FB Pages, Blogger
T=Inoreader, Tumblr, Flickr, Twitter
A=Buffer
A=Google Drive
T=Twitter, Life360
A=Ecobee, Tumblr
T=Automatic, Spotify, Withings, Dash, Toodledo, Fitbit
A=Google Cal, Google Glass, Jawbone UP, SMS
A=Todoist, Beeminder, Toodledo
T=Email
A=Hue, Wemo, Manything
T=Weather, Space
A=Hue, Nest, LittleBits, LIFX, Wemo, IF
T=IFTTT, GitHub, Wemo, Yo, Particle, Stripe, Smappee
A=email, Gmail, Pushbullet, IF notifications, SMS
T=Alexa (say specific phrase)
T=500px
A=Feedly
Interpretation via
IFTTT Channels
Groups of nodes identified with network clustering using Ayasdi’s Community Algorithm (Louvain modularity optimization).
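Louvain-style methods search for a grouping of nodes that maximizes modularity. The sketch below only computes that score for a given grouping on a toy graph; it is not the Louvain search itself and not Ayasdi's Community Algorithm. The function name and the toy graph are assumptions.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    internal = {}
    degree = {}
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


# Toy graph: two triangles joined by a single bridging edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(round(q_two, 4), round(q_one, 4))
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the kind of comparison a modularity optimizer makes when grouping nodes of the topological model.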
IMAGE AND VIDEO SOCIAL CONTENT
SOCIAL MEDIA
HOME IOT
INFORMATION
TASKS
WEARABLE IOT
TDA Identifies 6 Broad Groups of
IFTTT Recipes
Year 1
Year 3
Year 5
487 Amazon Alexa IFTTT Recipes
“Social Alexa”
Consumer talks to Alexa
“Alexa Automation”
Alexa triggers other actions
Red shading indicates that > 50% of recipes in a node use Amazon Alexa.
TDA identifies two categories of Alexa IFTTT recipes occupying different locations in the topological model.
SOCIAL MEDIA
HOME IOT
TDA of Subset of 487 Alexa IFTTT Recipes
Social Alexa: 247 Recipes
Alexa Automation: 168 Recipes
Other Alexa Recipes: 72 Recipes
Future Directions
Removing Noise in Data Provides a Clearer Topological Model
TDA of 2163 IFTTT ngrams (Hamming Metric)
TDA of 100 principal components of 2163 IFTTT ngrams (Euclidean Distance Metric)
Preliminary Analysis of Manually Coded IFTTT Rules
TDA of 455 IFTTT codes
Trigger noun and verb phrase codes
Action noun and verb phrase codes
(Hamming Metric)
Analysis of Manually Coded IFTTT Rules
THING TRIGGERS
THING ACTIONS
WEB TRIGGERS
WEB ACTIONS
WEB TRIGGERS
THING ACTIONS
THING TRIGGERS
WEB ACTIONS
Future
Research
Directions
Methodological
Comparison of alternative approaches for
processing structured text data (IFTTT
recipes): N-grams, word2vec, doc2vec,
topic models, latent semantic analysis,
human coding.
Comparison of alternative visualization
approaches: TDA, network analysis, PCA,
MDS, hierarchical and k-means clustering
Substantive
Why are certain IFTTT recipes more
frequently published, favorited and
added?
What are the dynamics of how new IFTTT
recipes emerge over time?
Acknowledgments
The Ayasdi 6.7 software platform for
topological data analysis (ayasdi.com)
was used to construct all topological
models of the IFTTT data. The authors
acknowledge the support of Devi
Ramanan, Global Head Product
Collaborations, Ayasdi Inc., Menlo Park,
CA.
IFTTT public recipe data from 2011-
2016 were provided by, and used with
permission of, IFTTT.com, San
Francisco, CA.
postsocial.gwu.edu