R For Beginners: Basic Graphics Code to Produce Informative Graphs, Part Two, Working With Big Data

(This article was first published on r – R Statistics and Programming, and kindly contributed to R-bloggers)

A tutorial by D. M. Wiig

In part one of this tutorial I discussed the use of R code to produce 3D scatterplots, a useful way to visualize the results of multivariate linear regression models. While scatterplots are a useful visual tool with most datasets, they become much more of a challenge when analyzing big data. These databases can contain tens of thousands or even millions of cases and hundreds of variables.

Working with data sets of this size involves a number of challenges, and producing visual presentations such as scatterplots can be a daunting task. I will start by discussing how scatterplots can be used to provide a meaningful visual representation of the relationship between two variables in a simple bivariate model.

To start I will construct a theoretical data set consisting of 50,000 x and y pairs of observations. One way to accomplish this is to use the R rnorm() function, which generates normally distributed random values with a specified mean and standard deviation. I will use this function to generate both the x and the y variable.

Before starting this tutorial make sure that R is running and that the datasets, LSD, and stats packages have been installed. Use the following code to generate the x and y values such that the mean of x = 10 with a standard deviation of 15, and the mean of y = 7 with a standard deviation of 3:

##############################################
## make sure package LSD is loaded
##
library(LSD)
x <- rnorm(50000, mean=10, sd=15)   ## generates x values and stores them in variable x
y <- rnorm(50000, mean=7, sd=3)     ## generates y values and stores them in variable y
####################################################
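Because rnorm() draws pseudo-random numbers, the exact values will differ from run to run. If you want the example to be reproducible, you can set the random seed before generating the data; a minimal optional sketch is shown below (the seed value 123 is arbitrary):

##############################################
## optional: fix the random seed so the simulated data are reproducible
##
set.seed(123)                        ## any integer will do; 123 is arbitrary
x <- rnorm(50000, mean=10, sd=15)
y <- rnorm(50000, mean=7, sd=3)
####################################################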

Now the scatterplot can be created using the code:

##############################################
## plot randomly generated x and y values
##
plot(x, y, main="Scatterplot of 50,000 points")
####################################################

[Screenshot: scatterplot of the 50,000 randomly generated (x, y) points]

As can be seen, the resulting plot is mostly a mass of black, with relatively few individual x and y points visible other than the outliers. We can run a quick histogram on the x values and the y values to check the normality of the resulting distributions. This is shown in the code below:
####################################################
## show histogram of x and y distribution
####################################################
hist(x)   ## histogram for x mean=10; sd=15; n=50,000
##
hist(y)   ## histogram for y mean=7; sd=3; n=50,000
####################################################
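In addition to the histograms, a quick numerical check of the simulated distributions is easy; the short sketch below simply prints the sample means and standard deviations, which should be close to the parameters passed to rnorm():

####################################################
## numerical check of the simulated distributions
####################################################
mean(x); sd(x)   ## should be close to 10 and 15
mean(y); sd(y)   ## should be close to 7 and 3
####################################################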

[Screenshot: histogram of the x distribution]

[Screenshot: histogram of the y distribution]

The histograms show a normal distribution for both variables. As expected, the center mass of points in the x vs. y scatterplot is located at the x = 10, y = 7 coordinate of the graph, since this coordinate contains the mean of each distribution. A more meaningful scatterplot of the dataset can be generated using the R functions smoothScatter() and heatscatter(). The smoothScatter() function is located in the graphics package and the heatscatter() function is located in the LSD package.

The smoothScatter() function creates a smoothed color density representation of a scatterplot. This allows for a better visual representation of the density of individual values for the x and y pairs. To use the smoothScatter() function with the large dataset created above use the following code:

##############################################
## use the smoothScatter() function to visualize the scatterplot of the
## 50,000 x and y values; x and y should still be in the workspace as
## created above with the rnorm() function
##
smoothScatter(x, y, main = "Smoothed Color Density Representation of 50,000 (x,y) Coordinates")
##
####################################################

[Screenshot: smoothScatter() plot of the 50,000 (x, y) coordinates]

The resulting plot shows several bands of density surrounding the coordinates x = 10, y = 7, which are the means of the two distributions, rather than an indistinguishable mass of dark points.
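smoothScatter() also accepts optional arguments that control the density estimate and the colour ramp. The sketch below adjusts two of them, nbin and colramp; the specific values shown are illustrative choices rather than recommendations:

##############################################
## optional: finer density binning and a custom colour ramp
##
smoothScatter(x, y,
              nbin = 256,
              colramp = colorRampPalette(c("white", "steelblue", "darkblue")),
              main = "Smoothed Color Density, finer binning")
####################################################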

Similar results can be obtained using the heatscatter() function. This function produces a similar visual based on densities that are represented as color bands. As indicated above, the LSD package should be installed and loaded to access the heatscatter() function. The resulting code is:

##############################################
## produce a heatscatter plot of x and y
##
library(LSD)
heatscatter(x, y, main="Heat Color Density Representation of 50,000 (x, y) Coordinates")   ## heatscatter() with n=50,000
####################################################

[Screenshot: heatscatter() plot of the 50,000 (x, y) coordinates]

In comparing this plot with the smoothScatter() plot, one can see the distinctive density bands surrounding the coordinates x = 10, y = 7 even more clearly. You may also notice, depending on the computer you are using, that the heatscatter() plot takes noticeably longer to produce.
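If you want to quantify that difference on your own machine, system.time() gives a rough comparison; the elapsed times will of course vary with your hardware:

##############################################
## rough timing comparison of the two plotting functions
##
system.time(smoothScatter(x, y))   ## elapsed time for smoothScatter()
system.time(heatscatter(x, y))     ## elapsed time for heatscatter()
####################################################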

This tutorial has hopefully provided some useful information relative to visual displays of large data sets. In the next segment I will discuss how these techniques can be used on a live database containing millions of cases.


from R-bloggers https://www.r-bloggers.com/r-for-beginners-basic-graphics-code-to-produce-informative-graphs-part-two-working-with-big-data/

Nagbot Sends Mean Texts to Help You Stick to Your Resolutions

If you’ve made resolutions, you don’t want to forget about them when the novelty of the new year has worn off. Nagbot is a fun texting app that helps you stick to your goals by texting you regular reminders.

To create a nag, you enter your name, phone number, and goal into Nagbot’s website. From there, you can choose how often and when you want Nagbot to nag you. You’ll also decide how mean you want it to be, from “You’ll get ‘em next time!” to “You’re dead to me.” The tool is similar to apps like Streaks in that it sends you daily (or weekly) reminders to work on a task. However, it’s a little sassier and simpler, and perhaps more importantly, it’s free. You can also opt out of the texts whenever you want with “STOP.”

Nagbot has a really simple goal tracking function, too. When it texts you, you get a link you can use to check your goal progress on Nagbot’s website. You obviously have to provide your number, but their privacy policy explicitly states, “Nagbot will not rent or sell potentially personally-identifying and personally-identifying information to anyone.”

Head to the link below to give it a try.

Nagbot via ProductHunt

from Lifehacker, tips and downloads for getting things done http://lifehacker.com/nagbot-sends-mean-texts-to-help-you-stick-to-your-resol-1791188092

Why The Law of Large Numbers is Just an Excuse

Everyone has tough quarters, and usually, at least one tough year (more on that here).

As we approach $10m, and then again as we approach $20m, and then again as we approach $X0m … we often blame a factor that I believe is rarely real — The Law of Large Numbers.

The Law of Large Numbers is really two excuses rolled into one:

  • Our market isn’t that big.  So of course, by $Xm in ARR, or $1Xm in ARR, growth is going to slow down a lot.  We’re doing as well as we can be expected, given how niche our market is.
  • No way we can add that much more revenue this year.  The real challenge in SaaS is that, let’s say you just want to go from $5m last year to $10m  this year.  That means, net of churn, you have to sell more this year than every other year before combined.  If you aren’t just crushing it, that can feel close to impossible.  How can we add as much or more revenue this year than every past year put together?  Goodness.

So you excuse slower growth this year due to the Law of Large Numbers.

Now if that were true, it would indeed be the perfect excuse.  But bear in mind, there are several “counter-winds” to the Law of Large Numbers:

  • Every SaaS Market is Bigger Than Ever.  Look at the latest batch of IPOs, from Twilio to Coupa to Appdynamics and more.  They are all growing at 70%+ at $100m+ in ARR (more on that here).  It’s not because they are “better” than the last generation of SaaS IPOs.  It’s because the markets are bigger.  And if all SaaS markets are bigger, if every segment of business is moving more and more to the cloud … then even if the Law of Large Numbers is true for you … it should be true later.
  • Upsell, Net Negative Churn, and Second Order Revenue Come to the Rescue.  If you have happy customers and high NPS/CSAT … then by around $4m-$5m in ARR, generally, your existing customer base itself starts to create real revenue.  As a rough rule, aim for at least 120% net revenue from your trailing customer base by the time you hit $4m-$5m in ARR.  That means that, if you execute well here, a big chunk of your growth for this year comes from customers you already closed last year (for example, a $5m base retained at 120% net contributes $1m of ARR growth before you close a single new customer).  That means it gets easier, folks.  Because last year you didn’t have the big base to upsell to.
  • Everyone Gets Better at Driving Up Deal Sizes and ACVs.  Over time, everyone learns their customer base and how to add more value.  The combination usually means you are able to drive up deal sizes, pricing, and ACV.  If you drive up the average deal size just 10-20% this year, that again makes growing your total ARR easier.  It’s just math.  Everyone that gets good at selling a product learns how to drive deal sizes up at least a smidge.  Everyone.
  • Your Brand Boosts Marketing (and Pricing).  Once you hit just a few million in ARR, you’ll start to develop a mini-brand.  And once you hit $10m in ARR or so, you’ll almost certainly have a real brand in your space.  Once you have a brand, even with a mediocre marketing team, you’ll get pulled into more and more deals.  And once you have a trusted brand, you can charge at the high end of the market.  The combination of the two makes it easier to scale.  If you have a positive, high-NPS brand in your space and you aren’t getting better and better leads … your marketing team is simply terrible.  Make a change tomorrow.  Maybe even tonight.
  • Your Team Gets Better.  This is why you want zero voluntary attrition in your sales team.  And probably your customer success team, too.  And your demand gen team.  Everyone that is truly good gets better.  Your best sales reps just have it dialed in.  Your CS team knows exactly where the land mines are in saving customers, and how to get them to buy more seats.  Everyone just gets better in their second and third year.  This makes growing revenue easier, too.  Your team just wasn’t as seasoned last year.
  • The Great Teams Figure it Out.  And finally, let’s be clear.  The Law of Large Numbers does hit you earlier if you don’t expand your market, and redefine it.  Your very initial 1.0 product may only have a $10m TAM.  But the best teams always expand and redefine their markets.  It’s not easy.  But they always get it done before it impacts ARR growth materially.

So my hope here is that, if nothing else, we’ve challenged your anxiety around the Law of Large Numbers.  I had this anxiety myself.  We probably all do.

But great teams solve it.

And if you are hitting a LOLN wall … that’s a clear sign.  A clear sign:

  • You are way behind in adding, and/or upgrading, your senior team.
  • And probably a clear sign you are behind on NPS and CSAT.

Just upgrade those two.

You will probably get right back on track.

from SaaStr http://www.saastr.com/why-the-law-of-large-numbers-is-just-an-excuse/

What Happened When I Stopped Saying “Sorry” At Work For A Week

We all say “I’m sorry” too often—that much you already know. And, trust me, I’m right in that boat with you. I’m consciously aware of the fact that I’m a chronic over-apologizer.

Sure, I’ve read the countless articles about apps that could help me and little tweaks that could stop me in my tracks before those two small words mindlessly fly out of my mouth. But in all honesty, very little of it has worked for me. Nothing really sticks, and I still catch myself apologizing way more often than I should.

That is, until recently. I saw this Tumblr post circulating around the internet, and it piqued my interest.

Instead of attempting to stop yourself from saying something altogether, the user suggests replacing that oft-repeated “I’m sorry” with two different words: “Thank you.” This flips the script and changes something that could be perceived as a negative mistake into a moment for you to express your gratitude and appreciation.

Sounds great in theory, right? But how practical could it actually be? Would this be yet another suggested phrase that gets thrown out of the window the second I feel tempted to apologize?

Naturally, I felt the need to test it out myself—which is exactly what I’ve been doing over the course of the past week. It involves quite a bit of conscious thought (yes, there have been plenty of times when an apology was dancing on my lips, and I managed to catch it just in time). But so far I’ve managed to be pretty consistent with this change.

When an editor pointed out an error I had made in one of my articles, I didn’t respond immediately with, “Ugh, I’m so sorry about that!” Instead, I sent a reply with a line that read, “Thank you for that helpful note!”

And like the Tumblr user, when I ran late for a coffee meeting with a networking acquaintance, I resisted the urge to apologize profusely and instead thanked her for waiting for me.

While it does take a little bit of effort on your end (and, fair warning, you might slip up a few times at first), swapping out these words is still a relatively small change for you to make. But rest assured, so far I’ve noticed a big impact—more so with myself than with the people I had been apologizing to.

When I had previously spewed out countless sorries, I spent a good chunk of time feeling guilty. I had begun our exchange with something negative, which then seemed to cast a dark shadow over the rest of our conversation—like I had started things off on the wrong foot and needed to spend the rest of my time proving myself and recovering from my faux pas.

But by switching that negative to a positive, I found that I could move on from my slip-up much faster. I didn’t need to spend time mentally obsessing over what I had screwed up, because my genuine “thank you” had provided a much more natural segue into a different discussion—rather than the awkward exchange that typically follows an apology.

Needless to say, this is a change I plan to continue to implement to improve my communication skills. It’s the only thing I’ve found that actually halts my over-apologizing. And as an added bonus, it transforms those previously remorse-filled exchanges into something constructive and upbeat. What more could you want?


This article originally appeared on The Daily Muse and is reprinted with permission.

from Co.Labs https://www.fastcompany.com/3067232/what-happened-when-i-stopped-saying-sorry-at-work-for-a-week?partner=rss

A Designer’s Guide to Perceived Performance

A well-designed site isn’t just one that is easy to use or elegant to look at. A site isn’t well-designed unless the user is satisfied with their experience. An overlooked aspect of this experience is performance. A slow, beautiful site will always be less satisfying to use than an inelegant fast site. It takes a user just three seconds to decide to abandon a website.

“To the typical user, speed doesn’t only mean performance. Users’ perception of your site’s speed is heavily influenced by their overall experience, including how efficiently they can get what they want out of your site and how responsive your site feels.” – Roma Shah, User Experience Researcher

“A slow, beautiful site will always be less satisfying to use than an inelegant fast site.”

On the surface, performance is achieved through compression, cutting out extra lines of code, and more, but there are limits to what can be achieved at a technological level. Designers need to consider the perceived performance of an experience to make it feel fast.

“There are two kinds of time: clock time and brain time.”

There are two kinds of time: clock time and brain time. The former is the objective measure of time; the latter is how a person perceives time. This is important to people involved in human-computer interaction, because we can manipulate a person’s perception of time. In our industry, this manipulation is called the perception of performance.

How Quick is Appropriate?

This visual demonstrates how we perceive time. Anything less than one second is perceived as ‘instant’ behaviour; it is almost unnoticeable. Around one second still feels immediate; anything more than that is when the user realises they are waiting.

Instant behaviour could be an interface providing feedback. The user should not have to wait for this; they should get a message within 0.2s of clicking a button.

Immediate behaviour could be a page loading. The user should not have to wait any more than 1 or 2 seconds for the results they want to load.

If an interface needs that extra time, we should say ‘this may take a few more seconds’ and provide feedback on how long it will take. Don’t leave the user asking too many questions.

Active & Passive Modes

Humans do not like waiting. We need to consider the different modes a person is in when using a website or application: the active and passive modes. During the active mode users do not realise they are waiting at all; during the passive mode their brain activity drops and they get bored.

“It takes a user just three seconds to decide to abandon a website.”

You can keep people in the active mode by pre-loading content. Modern browsers do this while you are typing in a URL or searching in the address bar. Instagram achieves this by beginning to upload photographs in the background the moment you choose a photograph and start creating the post, which makes the upload feel instant.


Instagram also shows an obscured preview of images that have not yet loaded.

“As designers, we should do everything we can to keep our users in the active mode.”

Display content as soon as you can to reduce the amount of time a user is in the passive mode. YouTube does this by streaming the video to the user despite it not being 100% downloaded. Instead, it estimates how fast the user can stream, waits for that portion of the video to load, automatically chooses a bitrate, and starts playing it, buffering only when absolutely necessary.

Both methods require us to prioritise the content we want, and load the rest of the page around it.

“Your page needs to load 20% faster for your users to notice any difference.”

Your page needs to load 20% faster for your users to notice any difference. If your page takes 8s to load today, a new version needs to take 6.4s to load for it to feel faster. Anything less than 20% is difficult to justify.

Helping Developers

Even if you don’t understand every aspect of page speed, you should be thinking about it the moment you start creating a design system for a UI, working with the development team to fine-tune performance and figure out where marginal gains can be had.

This could be as simple as ensuring you provide loading states and fallbacks (failed states) to your developers so the user doesn’t have to wait for the entire page to load before they can read anything.

Here’s a short step-by-step guide to ensuring you are considering performance when designing:

  • Research the priority content that should load in your interface. If it’s a news article, the text content should load first, allowing the user to start reading before the experience has even finished loading.
  • Provide a loading state (e.g. placeholder content) and a fallback (e.g. un-styled text) for all elements you design and use.
  • Work with the developers to fine tune performance and work out what technologies can be used to ensure quick loading (e.g. browser caching and progressive jpegs).
[Image: Slack loading screen with placeholder content]

Slack takes a common approach, using placeholder content to imply what the user is going to see and making progress feel faster than it is. A blank screen here would be frustrating.

These tasks may seem complete, but it is important to revisit your work and fine-tune to make as many marginal gains as possible.

Measuring Performance

A way to measure perceived performance is by inviting users to navigate your site and asking them to estimate how long it took to load. Another option is to provide multiple experiences and ask which is faster.

[Image: example perceived-performance survey scale]

A survey can be as simple as a scale like this one. Get enough answers, and you have a clear average.

The sample should be large enough to gather a realistic average that takes into account different perceptions and, if remote, the varying connection speeds of your participants.
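As a rough illustration of that kind of analysis, here is a minimal sketch in R, with made-up survey responses, that computes the average perceived load time and a 95% confidence interval around it:

## hypothetical perceived load times (in seconds) reported by 12 participants
perceived <- c(4, 5, 3, 6, 4, 5, 7, 4, 5, 6, 3, 5)
mean(perceived)              ## average perceived load time
t.test(perceived)$conf.int   ## 95% confidence interval for the mean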

“A site isn’t well-designed unless the user is satisfied with their experience.”

Once you have measured the perceived performance, you should continue to tweak it, perform research, and make further improvements. Things can only get better. Keep tweaking until it’s at a point you’re happy with it, then tweak again.


from Sidebar http://sidebar.io/out?url=https%3A%2F%2Fblog.marvelapp.com%2Fa-designers-guide-to-perceived-performance%2F

Deep Learning Can be Applied to Natural Language Processing



By Carlos Perez, Intuition Machine.



There is an article making the rounds on LinkedIn that attempts to make an argument against the use of Deep Learning in the domain of NLP. The article by Riza Berkan, “Is Google Hyping it? Why Deep Learning cannot be Applied to Natural Languages Easily”, argues that DL cannot possibly work and that Google is exaggerating its claims. The latter argument is of course borderline conspiracy theory.

Yannick Vesley has written a rebuttal, “Neural Networks are Quite Neat: a Reply to Riza Berkan”, where he makes his arguments on each point that Berkan makes. Vesley’s points are on the mark; however, one cannot ignore the feeling that DL theory has a few unexplained parts in it.

However, before I get into that, I think it is very important for readers to understand that DL is currently an experimental science. That is, DL capabilities are actually discovered by researchers by surprise. There is certainly a lot of engineering that goes into the optimization and improvement of these machines. However, its capabilities are ‘unreasonably effective’; in short, we don’t have very good theories to explain them.

It is clear that there are gaps in understanding, reflected in at least three open questions:

  1. How is DL able to search high dimensional discrete spaces?
  2. How is DL able to perform generalization if it appears to be performing rote memorization?
  3. How do (1) and (2) arise from simple components?

Berkan’s arguments exploit our current lack of a solid explanation to promote his own alternative approach. He is arguing that a symbolicist approach is the road to salvation. Unfortunately, nowhere in his arguments does he acknowledge the brittleness of the symbolicist approach, its lack of generalization, and its lack of scalability. Has anyone created a rule-based system that can classify images based on low-level features and rival DL? I don’t think so.

DL practitioners, however, aren’t stopping their work just because they don’t have airtight theoretical foundations. DL works, and works surprisingly well. DL in its present state is an experimental science, and it is absolutely clear that there is something going on underneath the covers that we don’t fully understand. A lack of understanding, however, does not invalidate the approach.

To understand the issues better, I wrote in an earlier article about “Architecture Ilities found in Deep Learning Systems”. I basically spell out the 3 capabilities in DL:

  • Expressibility — This quality describes how well a machine can approximate universal functions.
  • Trainability — How well and quickly a DL system can learn its problem.
  • Generalizability — How well a machine can perform predictions on data that it has not been trained on.

There are of course other capabilities that also need to be considered in DL: Interpretability, modularity, transferability, latency, adversarial stability and security. But these are the main ones.

To get our bearings right about explaining all of these, we have to consider the latest experimental evidence. I’ve written about this here, “Rethinking Generalization”, which I summarize again:

The ICLR 2017 submission “Understanding Deep Learning Requires Rethinking Generalization” is certainly going to disrupt our understanding of Deep Learning. Here is a summary of what the authors discovered through experiments:

1. The effective capacity of neural networks is large enough for a brute-force memorization of the entire data set.

2. Even optimization on random labels remains easy. In fact, training time increases only by a small constant factor compared with training on the true labels.

3. Randomizing labels is solely a data transformation, leaving all other properties of the learning problem unchanged.

The point here that surprises most Machine Learning practitioners is the ‘brute-force memorization’. See, ML has always been about curve fitting. In curve fitting you find a sparse set of parameters that describe your curve and you use that to fit the data. The generalization that comes into play relates to the ability to interpolate between points. The major disconnect is that DL systems have exhibited impressive generalization, yet that cannot possibly work if we consider them as just memory stores.

However, if we consider them as holographic memory stores, then that problem of generalization has a decent explanation. In “Deep Learning are Holographic Memories” I point out the experimental evidence that:

The Swapout learning procedure tells us that if you sample any subnetwork of the entire network, the resulting prediction will be similar to that of any other subnetwork you sample. Just like holographic memory, where you can slice off pieces and still recreate the whole.

As it turns out, the universe itself is driven by a similar theory called the Holographic Principle. In fact, this serves as a very good base camp from which to begin a more solid explanation of the capabilities of Deep Learning. In “The Holographic Principle: Why Deep Learning Works” I introduce a technical approach using Tensor Networks that reduces the high-dimensional problem space to a space that is computable within acceptable response times.

So, going back to the question of whether NLP can be handled by Deep Learning approaches: we certainly know that it can work; after all, are you not reading and comprehending this text?

There certainly is a lot of confusion in the ranks of expert data scientists and ML practitioners. I was aware of the existence of this “push back” when I wrote “11 Arguments that Experts get Wrong about Deep Learning”. However, Deep Learning can likely best be explained by a simple intuition that can be explained to a five-year-old:


DE3p Larenn1g wrok smliair to hOw biarns wrok.

Tehse mahcnies wrok by s33nig f22Uy pa773rns and cnonc3t1ng t3Hm t0 fU22y cnoc3tps. T3hy wRok l4y3r by ly43r, j5ut lK1e A f1l73r, t4k1NG cmopl3x sc3n3s aNd br3k41ng tH3m dwon itno s1pmLe iD34s.

A symbolic system cannot read this; a human, however, can.

In 2015, Chris Manning, an NLP practitioner, wrote about the concerns of the field regarding Deep Learning (see: Computational Linguistics and Deep Learning). It is very important to take note of his arguments, since they are not in conflict with the capabilities of Deep Learning. His two arguments for why NLP experts need not worry are as follows:

(1) It just has to be wonderful for our field for the smartest and most influential people in machine learning to be saying that NLP is the problem area to focus on; and (2) Our field is the domain science of language technology; it’s not about the best method of machine learning — the central issue remains the domain problems.

The first argument isn’t a criticism of Deep Learning. The second argument explains that he doesn’t believe in one-size-fits-all generic machine learning that works for all domains. That is not in conflict with the above Holographic Principle approach that indicates the importance of the network structure.

To conclude, I hope this article puts an end to the discussion that DL is not applicable to NLP.

If you still aren’t convinced, then maybe Chris Manning should convince you himself:

Bio: Carlos Perez is a software developer presently writing a book on “Design Patterns for Deep Learning”. This is where he sources his ideas for his blog posts.

Original. Reposted with permission.

from KDnuggets http://www.kdnuggets.com/2017/01/deep-learning-applied-natural-language-processing.html

Fraugster, a startup that uses AI to detect payment fraud, raises $5M

Fraugster, a German and Israeli startup that has developed Artificial Intelligence (AI) technology to help eliminate payment fraud, has raised $5 million in funding.

Earlybird led the round, alongside existing investors Speedinvest, Seedcamp and an unnamed large Swiss family office. The new capital will be used to add to Fraugster’s headcount as it expands internationally.

Founded in 2014 by Max Laemmle, who previously co-founded payment gateway company Better Payment, and Chen Zamir, who I’m told has spent more than a decade in different analytics and risk management roles including five years at PayPal, Fraugster says it’s already handling almost $15 billion in transaction volume for “several thousand” international merchants and payment service providers, including (and most notably) Visa.

Its AI-powered fraud detection technology learns from each transaction in real-time and claims to be able to anticipate fraudulent attacks even before they happen. The result is that Fraugster can reduce fraud by 70 per cent while increasing conversion rates by as much as 35 per cent. The point of any fraud detection technology, AI-driven or otherwise, is to stop fraudulent transactions whilst eliminating false positives.

“We founded Fraugster because the entire payment risk market is based on outdated technology,” the startup’s CEO and co-founder Max Laemmle tells me. “Existing rule-based systems as well as classical machine learning solutions are expensive and too slow to adapt to new fraud patterns in real-time. We have invented a self-learning algorithm that mimics the thought process of a human analyst, but with the scalability of a machine, and gives decisions in as little as 15 milliseconds”.

Once integrated, Fraugster starts collecting transaction data points such as name, email address, and billing and shipping address. This is then enriched with around 2,000 extra data points, such as an IP latency check to measure the real distance from the user, IP connection type, distance between key strokes, and email name match. Then the enriched dataset is sent to the AI engine for analysis.

“At the heart of our AI engine is a very powerful algorithm which can mimic the thought process of a human analyst reviewing a transaction. As a result, we can analyze the story behind every transaction and say with precision which transactions are fraud and which aren’t,” explains Laemmle.

“You get a score or decision. Results are completely transparent (and not a black box), so you can understand exactly why a transaction was blocked or accepted. On top of this, our speeds are as low as 15ms. The reason why we’re so fast is because we’ve invented our own in-memory database technology”.

Fraugster cites as competitors incumbent enterprise-level companies like FICO or SAS, which, it claims, are based on outdated technology.

Adds Laemmle: “At Fraugster, we do not use any rules, models or pre-defined segments. We don’t use a single fixed algorithm to analyze transactions either. Our engine reinvents itself with every new transaction. This lets us understand transactions individually and therefore decide which one is fraudulent and which one isn’t. As a result, we can offer unprecedented accuracy and the ability to foresee fraudulent transactions before they happen”.

from TechCrunch https://techcrunch.com/2017/01/16/fraugster/?ncid=rss

The 3 tenets of applied ethnography

User experience design is an intriguing field.


It’s relatively new, and relatively subjective. When designing a user experience, there’s a lot of judgement involved. For every piece of quantitative data we can use, there’s a piece of qualitative data that must be interpreted.


And even when there’s a wealth of quantitative data, we still must apply design logic—a practice that varies from designer to designer.


At first, this requirement may seem like a weakness, but it’s actually one of the design field’s greatest strengths: the interpretation and application of data in varying ways is why we see innovation in UX.


The industry giants in particular continue to set a rapid pace of innovation. Meanwhile, the day-to-day design process may have become a little stale for many “regular” UX designers.



“UX is an extension of psychology.”




So how can we designers jumpstart our innovative engines? Interestingly enough, many UX designers often overlook one of the richest sources of virtually free (and objective) innovation and inspiration: psychology.


Related: Design and the psychology of time


This shouldn’t come as much of a surprise—UX is essentially an extension of psychology. Indeed, UX is an application of psychological principles in a context that traditionally has little to do with psychology.


The practice of ethnography has many applications in UX. Let’s discuss some ways to apply ethnographic research to UX design.


Ethnography in UX




You’ve probably already encountered ethnography during the course of your UX career thus far, either in practice or in theory.
Unfortunately, most, if not all, of the articles on ethnography are about the same thing: why you should conduct ethnographic research.


What’s not often talked about is why you should embrace it. So here’s a UX designer’s definitive guide to applying ethnography.




The true power of ethnography is in its exploration of the social, team, political, and organizational influences that guide the views and decisions made by humans.


It’s rooted in the principle that individual views and decisions are guided by culture as much as they are by, well, anything else. This is an intuitive stance, and it’s heavily applicable to UX design.


Let’s dive in.


1. Data collection


It all starts with data. How do you collect it, analyze it, and apply it?


Let’s address collection first. Data collection in ethnographic research is a very qualitative affair. The typical setup involves a significant amount of observation and note taking, along with the occasional question.


In psychology, ethnographic research is typically conducted as though the researcher were a part of the group that is the subject of scientific study. That is, the researcher should conduct research through the eyes of the subject, so that he or she may paint a more complete picture of the group’s views and beliefs.
And this is true for UX as well.



“Ethnographic research gives you data from your user’s point of view.”


Whether you’re conducting ethnographic research for an app that is going to launch in a foreign country, or you are designing an experience that is specific to a particular region, you should embed yourself into this culture for the duration of your research. 
A good way to do this is to experience firsthand the ways your product may be used in the day-to-day lives of your target users.


For example, if you’re designing an app for users of a certain culture who wish to send money back and forth to their family, you would research the needs and beliefs of that group by engaging in the process within their cultural environment—essentially, living a few days in their shoes while using your app.




2. Turning qualitative data into quantitative data


Along with proper cultural immersion, perhaps the greatest challenge in ethnographic research is converting the qualitative data it generates into quantitative data that can be used to make design decisions.


There are many qualitative measures in UX, so you’re likely to be familiar with this process.


One good way to generate quantitative measures is to categorize the qualitative observations and concerns into relevant and consistent groups. Then you determine the frequency of those measures.


The frequency of each issue type can be plotted on a graph and used to inform countless design decisions. This is called the incidence of an issue.


Another good tool when converting qualitative data into quantitative measures is a confidence interval, which allows you to apply a “confidence percentage” to your data, based on the number of people sampled in a specific data set.


The higher the confidence level, the more certain you are that the population is fairly represented by the sample data. This can often be useful when justifying design decisions to the stakeholders of a project.
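To make this concrete, here is a minimal sketch in R, using hypothetical observation counts, that turns categorized qualitative observations into incidence rates and puts a 95% confidence interval around one of them:

## hypothetical counts: how many of 40 observed sessions hit each issue type
issues   <- c(navigation = 18, terminology = 9, trust = 5)
sessions <- 40
issues / sessions                                       ## incidence rate of each issue type
binom.test(issues[["navigation"]], sessions)$conf.int   ## 95% CI for the navigation incidence

Gather enough sessions and the intervals tighten, which is exactly the kind of “confidence percentage” argument you can bring to stakeholders.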


3. Using ethnography to validate or generate assumptions


What good is ethnographic data if you don’t use it to validate or generate assumptions? In fact, the best way to use ethnographic data is to compare it with existing assumptions and develop them into knowns.


So what can we validate through ethnographic data?


First, let’s consider the basics: a group’s feelings and presumptions.



“Ethnographic data helps you generate, challenge, and validate your design assumptions.”


By immersing yourself in a group or culture for a period of time, collecting observational data, and determining the frequency of those observations, the diligent UX researcher can easily determine a group’s feelings and presumptions on a subject.


In turn, knowledge of a group’s feelings and presumptions can be used to confidently inform design decisions.


Another interesting data set that can used to validate assumptions is the incidence of experience roadblocks. That is, how often—and in what form—do roadblocks occur in the experience of your target user base.


A roadblock can be classified as anything that deters a particular group from accomplishing a given task, or interacting with a given system. Knowledge of roadblocks, and specifically their incidence, can be used to help address whatever inefficiencies may be causing them.


Another phenomenal dataset to generate through ethnographic research is motivation and reward within a group. Specifically, what thoughts, feelings, and environmental factors might be causing members of a group to make particular choices?


This information can be critical when designing an experience that must inherently motivate a particular group to interact with the product.


Conclusion: Use ethnographic data as often as possible




As challenging as it can be to research a target audience from its own perspective, the rewards sure do seem to be worth it. The notion of a truly validated assumption, albeit through qualitatively generated data, is a profound one.




By immersing yourself in the culture and lifestyle of a group, you can generate data that can be extremely difficult to collect through surveys and other traditional data collection techniques.

Often, the challenges facing a group are not easily voiced, nor easily communicated. Sometimes the only way to mitigate the challenges facing these traditional methods is to take on the perspective of the user, for as long as it takes to generate enough usable data.


Ethnographic research can be used to generate and validate assumptions that can otherwise be very difficult to prove.


Should you succeed in your mission to collect ethnographic data, you will be rewarded with deep insight into the problems, thoughts, and concerns of your target population to an extent that is rarely attainable in the world of UX.




Yona Gidalevitz
Yona is Codal’s technical researcher. At Codal, he is responsible for content strategy, documentation, blogging, and editing. He works closely with Codal’s UX, development, marketing, and administrative teams to produce all manner of written content. You can check out his work on Codal’s blog, Medgadget’s blog, and Usability Geek. In his free time, Yona is an avid guitarist, cook, and traveler.

from InVision Blog http://blog.invisionapp.com/ux-applied-ethnography/

Unpacking Atlassian’s Acquisition Of Trello

Wrapping up my first CES/Vegas retreat, I boarded the plane to check Twitter to see — lo and behold — that Trello had been acquired by Atlassian for $425M in a great, quick early-stage venture outcome. There’s quite a bit to unpack here, so I’ll just leave a few thoughts here but would love to hear more from the crowd about the implications of this move:

1/ Accidental Happenings and Side Projects: I do not mean to suggest Trello’s success and outcome is accidental, but rather that it doesn’t appear (from afar) that Trello had a normal birth or childhood. Trello was created inside Fog Creek Software, co-founded by Joel Spolsky, and then spun out in 2014 and funded by a mix of seed investors and early-stage VCs. Spolsky became CEO of Stack Exchange and was Chairman of Trello, and I believe another Fog Creek founder ran Trello. As it started to grow, someone else ran Fog Creek. This may be fodder for another post at a later date, as the genesis of this outcome seems both accidental and also a bit looser and more creative than the traditional business rigidity we read about in countless startup “how-to” blogs. (Fun update: Per my friend, Sean Rose: “when Trello was still part of Fog Creek, it was funded via Fog Creek employees opting to have their bonuses go to the project.”)

2/ Cross-Platform Architecture, Mobile Card Format, and Business Integrations: Slack launched cross-platform from day one, on web and mobile. I am not exactly sure of Trello’s history — it seems as if they were web-first, mobile responsive, and then launched for iOS. Additionally, the interaction model of Trello featured boards (like Pinterest), which displayed nicely as cards in a mobile app. Finally, the Trello team had quietly built many storage and business process integrations into their offering, giving some of them away as a hook and charging larger teams for the privilege to stack them up. (Trello also didn’t have thousands of integrations, but enough to make customers happy — more integrations likely doesn’t mean they’re all useful.)

3/ Consumerization of Enterprise: This has been an “eye-rolling” buzzword, but we have to accept it is an apt descriptor. Following the success of prosumer designs in apps like Slack, Asana, Wunderlist, and others (more on this below), Trello’s design delivers a lightweight experience to users with enough infrastructure and power to fuel large teams across many different platforms. Trello simply feels like a consumer product, something that may have been designed inside Google or Facebook — but much better, cleaner.

4/ Capital Efficiency: Assuming Crunchbase and my sources are correct, Trello is (relatively) a modern case study in capital efficiency. Having only raised about ~$10M, Trello seemed to not only grow its team (over 100) and its user base (19M+) quickly, they also marketed a three-tier freemium product that charged more to small businesses and even more for enterprise customers. In VC-math terms, Trello likely produced an 8.5x realized (mostly in cash) exit for its investor in less than three (3) years (which positively impacts IRR) and didn’t have to raise round after round of capital. Compared to some of its peer products like Asana and Wunderlist, among others, Trello has been remarkably capital efficient relative to its exit value. (A reader notes that its spinout from Fog Creek also adds to its capital efficiency.)

5/ Enterprise SaaS consolidation: For years now, we have witnessed different varieties of M&A across enterprise SaaS, whether it’s an incumbent like Salesforce scooping up new products or private equity shops buying small-cap public companies, there’s more and more pressure in the environment for the larger companies to expand their offerings to grow, as well as financial incentives for buyouts led by managers who can profit from creatively rolling-up disparate end-point solutions. In a world where collaborative products like Slack or Facebook @ Work or Microsoft Teams are growing and/or boast infinite financial resources, other growing incumbents (like Atlassian) need to prepare for a long-term product and mindshare battle and scooping up Trello is a good step in that direction. As Fred Wilson predicted a few days ago for 2017, “The SAAS sector will continue to consolidate, driven by a trifecta of legacy enterprise software companies (like Oracle), successful SAAS companies (like Workday), and private equity firms all going in search of additional lines of business and recurring subscription revenue streams.”

6/ “If I Can Make It There, I’ll Make It Anywhere” – Another solid exit for the NYC startup market, and there are bigger ones to come. Despite Trello being young and a SMB/enterprise product from NYC, it recently internationalized to a few non-English-speaking markets worldwide. As a bonus, while I don’t know the team, from what I hear from friends, Spolsky, Pryor, and their team are well-respected and seem to have done things the right way — their way. Congrats on building a great product.

from Haystack http://blog.semilshah.com/2017/01/10/unpacking-atlassians-acquisition-of-trello/