Categories
Big Data, Data Analytics

New IT Process Automation: Smarter and More Powerful

Whether we look at the newest trends in IT service automation, follow the recent research, or listen to the top speakers at conferences and meetups, they all inevitably point to the same thing: automation increasingly relies on Machine Learning and Artificial Intelligence.

It may sound as if these two concepts are being used as buzzwords to declare that process automation follows the global trends, and that is partially true. In theory, machine learning can enable automated systems to test and monitor themselves, to provision additional resources when necessary to meet timelines, as well as retire those resources when they are no longer needed, and in this way to enhance IT processes and software delivery.

Artificial Intelligence, in turn, refers to completely autonomous systems that can interact with their surroundings in any situation and reach their goals independently.

However, most organizations are in the very early days of actually implementing such solutions. The idea behind the need for AI and related technologies is that many decisions are still the responsibility of developers in areas that could be effectively addressed by adequately training computer systems. For example, it is the developer who decides what needs to be executed, but identifying…

Read More on Dataflow

Categories
Big Data, Data Analytics

Setting up an Analytics Team for Success = Get Fuzzy!

Building on our month focussed on controversial topics, let’s turn to what will set your team up for success.

Different contexts can require different types of analytics team. A lot of the advice that I offer in the Opinion section of this blog is based on a lifetime leading teams in large corporates. So, I'm pleased to partner with guest bloggers from other settings.

So, over to Alan to explain why getting “fuzzy” is the way for an analytics team to see success in the world of startups…

Get fuzzy! Why it is needed

My co-founders and I have recently had to face up to the challenge of creating a new data analytics team, having set up our new firm, Vistalworks, earlier in 2019. Thinking about this challenge, reflecting on what we know, and getting to the right answer (for us) has been an enlightening process.

With 70-odd years of experience between us, we have plenty of examples of what not to do in data analytics teams, but the really valuable question has been what we should do, and what conditions we should set up to give our new team the best chance of success.

As we talked through this issue, my main personal observation was that successful data analytics teams, of whatever size, have…

Read More on Dataflow

Categories
Big Data, Data Analytics

How Big Data is Changing The Way We Fly

Airline big data, combined with predictive analytics, is being used to drive up airline ticket prices.

As airlines and their frequent flyer programs gather more intelligence on your day-to-day lifestyle, flying habits and financial position, they begin to build an airline big data profile.

Consumer interests, goals, psychometric assessment, your motivations to engage with a brand at any given point throughout the day, what has driven you to purchase in the past – and most importantly – where your thresholds are.

To illustrate how data is playing a growing role in today's flight booking engines, I've broken down play by play how each piece of data collected about you can be used, analysed and overlaid with other datasets to paint a picture of who you are and what motivates and drives you to purchase a particular product.

Every day, trillions of calculations are number-crunched to transform this goldmine of data into real, tangible, high-revenue opportunities for the airlines and their frequent flyer programs.

“When armed with key insights, a holistic overview of your, and other customers', detailed profile information can be applied to direct booking channels which are designed to customize pricing for your personal situation at that very moment. Here is…

Read More on Dataflow

Categories
Big Data, Data Analytics

Deep Learning: Past and Future

Deep learning is growing in both popularity and revenue. In this article, we will shed light on the different milestones that have led to the deep learning field we know today. Some of these events include the introduction of the first neural network model in 1943 and the first use of this technology in 1970.

We will then address more recent achievements, starting with Google's Neural Machine Translation and moving on to lesser-known innovations such as Pix2Code, an application that generates layout code from GUI screenshots with 77% accuracy.

Towards the end of the article, we will briefly touch on automated learning-to-learn algorithms and democratized deep learning (embedded deep learning in toolkits).

The Past – An Overview of Significant Events

1943 – The Initial Mathematical Model of a Neural Network

For deep learning to develop, there needed to be an established understanding of the neural networks in the human brain.

A logician and a neuroscientist – Walter Pitts and Warren McCulloch respectively – created the first neural network mathematical model. Their work, 'A Logical Calculus of the Ideas Immanent in Nervous Activity', was published and put forth a combination of algorithms and mathematics aimed at mimicking…
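To make the idea concrete, here is a tiny illustrative sketch of a McCulloch-Pitts style threshold unit in Python; the weights, threshold, and inputs are chosen for illustration and are not taken from the original paper.

```python
# Illustrative sketch of a McCulloch-Pitts style threshold neuron.
# Weights, threshold, and inputs below are invented for this example.

def mcculloch_pitts_neuron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of binary inputs reaches the threshold."""
    activation = sum(i * w for i, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0

# With unit weights and a threshold of 2, the unit behaves like a logical AND gate:
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", mcculloch_pitts_neuron([x1, x2], [1, 1], threshold=2))
```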

Read More on Dataflow

Categories
Big Data, Data Analytics

9 Reasons Smart Data Scientists Don’t Touch Personal Data

The production of massive amounts of data as a result of the ongoing 'Big Data' revolution has transformed data analysis. The availability of analysis tools and decreasing storage costs, combined with businesses' drive to leverage these datasets alongside purchased and publicly available data, can bring insight and monetize this new resource. This has led to an unprecedented amount of data about the personal attributes of individuals being collected, stored, and lost. This data is valuable for analysis of large populations, but there are a considerable number of drawbacks that data scientists and developers need to consider in order to use this data ethically.

Here are just a few considerations to take into account before ripping open the predictive toolsets from your cloud provider:

1. Contextual Integrity

Data is gathered in different contexts, which have different reasons and permissions for capture. Ensure that the data you capture is valid for that context and cannot be misused for other purposes. There can be unintended side effects of mixing public and personal data. One example is sharing location data with other parties without consent; there are numerous examples of stalkers using applications to track others.

2. History Aggregation

History is an important part of many efforts to define…

Read More on Dataflow

Categories
Big Data, Business Intelligence, Data Analytics

What Does the Salesforce-Tableau Deal Mean For Customers?

Salesforce Buying Tableau for $15.7 Billion

Salesforce will buy Tableau Software for $15.7 billion in an all-stock deal announced Monday morning. Salesforce is doubling down on data visualization and BI in the purchase of one of the top enterprise technology brands.

The all-stock deal will be the largest acquisition in the history of the San Francisco-based cloud CRM giant. It is more than double the amount Salesforce paid for MuleSoft last year ($6.5 billion).

The acquisition price of $15.7 billion is a premium of more than 30 percent over Tableau’s market value of $10.8 billion as of the previous stock market close. The deal is slated to close in the third quarter. The boards of both companies have approved the acquisition, according to the announcement.

The acquisition comes barely a weekend after Google announced its massive $2.6 billion acquisition of Looker, which also makes data visualization software for businesses.

The deal is also expected to escalate the competition between Salesforce and Microsoft. The two are already fierce competitors in the CRM arena with Salesforce CRM and Microsoft Dynamics CRM. Salesforce, armed with the Tableau product suite, will now compete with Microsoft’s PowerBI data visualization and business intelligence technology. Tableau and Microsoft have been in a fierce fight the last three years, with Tableau’s stock under pressure.

At $15.7 billion, Salesforce buying Tableau is the largest analytics merger and one of the largest software deals in history.

It combines two leaders in their respective space, Tableau for Data Visualization, and Salesforce, leader in Customer Relationship Management SaaS software.

It’s not surprising Salesforce wanted Tableau. Salesforce, like any other large SaaS company, stores a massive amount of business data supplied by its thousands of customers. Naturally, those customers are hungry for advanced analytics on that data, and have been telling Salesforce that.

The risk for Salesforce and the massive amount of data it holds is letting that data flow out of its systems to those of competitors – not for new CRM services – but for Analytics.

Customers desiring analytics for Salesforce data have a multitude of choices, from major players like Microsoft's PowerBI to any of the hundreds of other analysis platforms. Google searches for "CRM Data Analytics" and its variants number in the thousands per day.

Over the past few years, Salesforce has swallowed analytics companies like goldfish at a '50s frat party. Its acquisitions in just the last two years include:

  • MuleSoft,
  • BeyondCore,
  • PredictionIO,
  • Griddable.io,
  • MapAnything.

Why is Salesforce Investing in Analytics?

Because data has massive value, both now and in the future. Salesforce knows that whoever controls the data inherits that value and has much greater influence over the customer.

Salesforce isn't the only one that knows this; many other cloud and SaaS players know it too. The new cloud "land-grab" is really a data grab, which may prove much more valuable than land over time. Cloud companies are doing everything they can to direct as much data as possible into their clouds, and to keep it there. Analytics services are a way to keep their customers' data happily ensconced within their own platform.

In the cloud universe, it’s much better to be a massive player with a strong gravitational pull that draws data toward you, than to see data flowing away from you. That may sound simplistic, but that glacial flow of data, first from the company, then into a SaaS application, then onward to other cloud companies, is what makes or breaks these companies’ fortunes.

Salesforce has turned most of its purchases in Data Analytics into the Einstein platform, which has had a decent reception by the market. However, Einstein has not had the planetary effect of drawing in non-Salesforce data and exists mainly to offer insights on Salesforce’s captive CRM data. Its adoption has not broadened significantly beyond Salesforce data.

The acquisition of BeyondCore brought augmented analytics into the portfolio by way of Salesforce Einstein Discovery. In this regard, the Tableau acquisition is good for Salesforce from a product perspective, while also a good move for Tableau shareholders.

There is some obvious overlap in the product portfolios. Tableau had acquired Empirical Systems to bolster its augmented analytics, an effort that will likely be slowed or sidelined. The immediate goal for Salesforce and Tableau will be to rationalize duplicate products and improve the integration. We wonder whether Tableau will become the face of the Salesforce analytics apps, which are full cloud products, since Tableau has continued to lag in its browser-based authoring. All this means that it is not necessarily good news for Tableau customers. The reactions on Twitter were decidedly mixed.

Winners and Losers: What does the Salesforce-Tableau deal mean for customers?

Definite Winner: Tableau Shareholders

Potential Winner: Salesforce Customers

Potential Losers: Tableau Customers, Salesforce Shareholders

The initial reaction in the markets and on Twitter was strong. Markets soundly rewarded Tableau shareholders with a 35% share-price leap the morning the news came out. Salesforce shareholders didn't fare so well, with their shares dropping 8% on the announcement, though they will likely recover as the news settles.

Both companies have strong, mature cultures. Tableau was multi-platform and connected to multiple data sources. Salesforce, which did buy MuleSoft to connect to other data sources, is likely to maintain Tableau's mission and approach, but it will have to prove that to some folks. Tableau has also built up a very successful community around its brand, with millions of loyal users among its fanbase.

One response on the Tableau community forum likely sums up the concerns by some customers:

“Will we wake up on this date next year and see ‘Tableau Powered by Salesforce,’ and then the next year Tableau becomes nothing more than a checkbox on the Salesforce contract? I have staked my career on this wonderful tool the past few years and truly love it. I just don’t want to see it ruined or fade off into the sunset.”

It will be interesting to watch how Tableau’s roadmap evolves or changes due to its new ownership.

These two deals are just the latest in a series of acquisitions of data analytics companies over the past quarter or two. We’ll cover the others in Part II of this post.

For now, here are some takeaways about all these acquisitions:

  • The Analytics and BI market remains hot, and valuations for these companies continue to climb.
  • It’s clear that most of the benefits of these deals will go to the shareholders. However, the CEOs and boards should also be doing their part to make sure the benefits are shared with the customers and loyal users of these technologies. After all, that’s what got them where they are.
  • This isn’t the first consolidation the Analytics industry has seen. In the late 2000s there was a wave of activity as behemoths like SAP, IBM and Oracle gobbled up Business Objects, Cognos and Hyperion, respectively. How did those turn out? Well, the fact that companies like Tableau were born shortly afterward signals that innovation in the bigger companies slowed down after those deals. This paved the way for newer, more agile companies (like Tableau) who listened to the market, and innovated to deliver what it demanded.

If you have a horse in this race, either as a customer, developer or employee of any of the affected companies, drop us a quick comment below to let us know how you’re feeling about this news, and how you think it might affect you.

Categories
Business Intelligence, Data Quality

The 3 Things You Need To Know If You Work With Data In Spreadsheets

Microsoft Excel and Google Sheets are the first choice of many users when it comes to working with data. They're readily available, easy to learn and support universal file formats. When it comes to using a spreadsheet application like Excel or Google Sheets, the point is to present data in a neat, organized manner which is easy to comprehend. They're also on nearly everyone's desktop, and were probably the first data-centric software tool any of us learned. Whether you're using Excel or Google Sheets, you want your data cleaned and prepped. You want it accurate and consistent, and you want it to be elegant, precise, and user-friendly.

But there is a downside. While spreadsheets are popular, they’re far from the perfect tool for working with data. We’re going to explore the Top 3 things you need to be aware of if you work with data in spreadsheets.

While spreadsheet tools are quite adequate for many small to mid-level data chores, there are some important risks to be aware of. Spreadsheets are desktop-class, file-oriented tools, which means their entire data contents are stored in volatile RAM while in use and on disk when you're not using them. That means that between saves, the data lives only in RAM and can be lost.

Risk #1: Beware of Performance and Data Size Limits in Spreadsheet Tools

Most people don’t check the performance limits in Spreadsheet tools before they start working with them. That’s because the majority won’t run up against them. However, if you start to experience slow performance, it might be a good idea to refer to the limits below to measure where you are and make sure you don’t start stepping beyond them. Like I said above, spreadsheet tools are fine for most small data, which will suit the majority of users.

But at some point, if you keep working with larger and larger data, you’re going to run into some ugly performance limits. When it happens, it happens without warning and you hit the wall hard.

Excel Limits

Excel is limited to 1,048,576 rows by 16,384 columns in a single worksheet.

  • A 32-bit Excel environment is subject to 2 gigabytes (GB) of virtual address space, shared by Excel, the workbook, and add-ins that run in the same process.
  • 64-bit Excel is not subject to these limits and can consume as much memory as you can give it. A data model’s share of the address space might run up to 500 – 700 megabytes (MB), but could be less if other data models and add-ins are loaded.

Google Sheets Limits

  • Google Spreadsheets are limited to 5,000,000 cells, with a maximum of 256 columns per sheet. (Which means the row limit can be as low as 19,531 if your file has a lot of columns!)
  • Uploaded files that are converted to the Google spreadsheets format can’t be larger than 20 MB and need to be under 400,000 cells and 256 columns per sheet.

In real-world experience, running on midrange hardware, Excel can begin to slow to an unusable state on data files as small as 50–100 MB. Even if you have the patience to operate in this slow state, remember you are running at redline. Crashes and data loss are much more likely!
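As a practical aside, a quick pre-flight check like the sketch below (Python with pandas; the file name is hypothetical) can tell you whether a dataset already exceeds the limits quoted above before you try to open it in a spreadsheet.

```python
# Rough pre-flight check against the spreadsheet limits quoted above.
# The input file name is hypothetical; any tabular file works the same way.
import pandas as pd

EXCEL_MAX_ROWS, EXCEL_MAX_COLS = 1_048_576, 16_384
GSHEETS_MAX_CELLS, GSHEETS_MAX_COLS = 5_000_000, 256

df = pd.read_csv("large_dataset.csv")   # hypothetical file
rows, cols = df.shape
cells = rows * cols

if rows > EXCEL_MAX_ROWS or cols > EXCEL_MAX_COLS:
    print("Too big for a single Excel worksheet.")
if cells > GSHEETS_MAX_CELLS or cols > GSHEETS_MAX_COLS:
    print("Exceeds Google Sheets cell or column limits.")
```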

(If you’re among the millions of people who have experienced any of these, or believe you will be working with larger data, why not check out a tool like Inzata, designed to handle profiling and cleaning of larger datasets?)

Risk #2:  There’s a real chance you could lose all your work just from one mistake

Spreadsheet tools lack the auditing, change control, and metadata features that would be available in a more sophisticated data cleaning tool. Those features are designed to act as backstops against unintended user error. Caution must be exercised, as multiple hours of work can be erased in a microsecond.

Accidental sorting and paste errors can also ruin your hard work. Sort errors are incredibly difficult to spot: if you forget to include a critical column in the sort, you've just corrupted your entire dataset. If you're lucky enough to catch it you can undo it; if not, that dataset is now ruined, along with all of the work you just did. If the data is saved to disk while in this state, it can be very hard, if not impossible, to undo the damage.
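A tiny pandas sketch of that failure mode may help: sorting a single column on its own breaks its alignment with the rest of the rows, while sorting the whole table keeps records intact. The column names and values are invented for illustration.

```python
# Why partial sorts corrupt data: a column sorted in isolation no longer
# lines up with the rows it belongs to. Example data is invented.
import pandas as pd

df = pd.DataFrame({"customer": ["Ann", "Bob", "Cal"], "balance": [300, 100, 200]})

# Wrong: sorting one column by itself reorders it independently of "customer".
corrupted = df.copy()
corrupted["balance"] = corrupted["balance"].sort_values().values
print(corrupted)   # Ann now shows Bob's balance

# Right: sort the whole DataFrame so every column moves together.
safe = df.sort_values("balance")
print(safe)
```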

Risk #3:  Spreadsheets Aren’t Really Saving You Any Time

Spreadsheets are fine if you only have to clean or prep data once, but that is rarely the case. Data is always refreshing, and new data is always coming online. Spreadsheets lack any kind of repeatable process or intelligent automation.

If you spend 8 hours cleaning a data file one month, you’ll have to repeat nearly all of those steps the next time a refreshed data file comes along.

Spreadsheets can be pretty dumb sometimes. They lack the ability to learn. They rely 100% on human intelligence to tell them what to do, making them very labor intensive.

More purpose-designed tools like Inzata Analytics allow you to record and script your cleaning activities via automation. AI and Machine Learning let these tools learn about your data over time. Data is also staged throughout the cleaning process, and rollbacks are instantaneous. You can set up data flows that automatically perform cleaning steps on new, incoming data. Basically, this lets you get out of the data cleaning business almost permanently.
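As a rough illustration of what a repeatable process looks like (this is a generic pandas sketch, not Inzata's actual workflow; the column names and file path are hypothetical), the same cleaning function can simply be re-run on every refreshed file instead of redoing the work by hand each month.

```python
# A minimal, reusable cleaning step: run the same function on each refresh.
# Column names and the file path are hypothetical placeholders.
import pandas as pd

def clean(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop_duplicates()
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce").fillna(0)
    return df

monthly = clean(pd.read_csv("sales_2019_06.csv"))   # hypothetical monthly refresh
```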

(Excerpt from The Ultimate Guide to Cleaning Data in Excel and Google Sheets)

Categories
Artificial Intelligence

“Hey Siri, Define Artificial Intelligence and Machine Learning”

Although one would assume these two highly used key terms would be well known and well defined in their respective industries/departments, they are surprisingly not. Artificial intelligence and machine learning, although similar, are quite different in many aspects, and a clear definition of each seems to be… necessary.

What is Artificial Intelligence?

The official definition of Artificial Intelligence (AI) reads: "the theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages."

In simpler terms, Artificial Intelligence seeks to imitate human intelligence. Using a process called statistical learning, Artificial Intelligence is able to receive and process information. There are two broad types of artificial intelligence: Narrow AI and Strong AI. Narrow AI, also known as Weak AI, is built to perform a single, narrow task; it lacks the self-awareness and consciousness to perform intelligent tasks beyond that. Strong AI, also known as Artificial General Intelligence, can approximate actual human intelligence: it can think and perform tasks on its own, much like a human being, and is distinctive in being self-aware and conscious enough to make its own decisions.

What is Machine Learning?

Machine Learning (ML) is the process of a computer reprogramming itself to perform more accurately and effectively based on statistical values that it picks up on. For example, if you wanted a computer that could tell the difference between dogs and cats, you could show it a few pictures of each and tell it whether the picture is a dog or a cat. The computer would pick up on details and differences that it notices, and the more pictures it sees, the more it will learn, and the better it will be at identifying the picture. Currently Machine Learning is being used for a wide variety of things in our everyday lives. Machine Learning is responsible for predictive texts on your phone, recommendations on music and movie streaming services, facial recognition, and spam filters on your email to name a few. Machine Learning is important because it makes it possible to quickly and automatically produce models that can analyze larger, more complex data and deliver faster and more accurate results even on a giant scale. Also, by building precise models, any organization has a better chance of identifying profitable opportunities and avoiding unknown risks.
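As a toy sketch of that dog-versus-cat idea, the snippet below trains a tiny scikit-learn decision tree on invented feature vectors (weight and ear length); the numbers and labels are purely illustrative, not a real dataset.

```python
# Toy version of the dog-vs-cat example: learn from labeled feature vectors,
# then predict labels for new ones. Feature values are invented.
from sklearn.tree import DecisionTreeClassifier

features = [[30, 8], [25, 9], [4, 4], [5, 3]]   # [weight_kg, ear_length_cm]
labels = ["dog", "dog", "cat", "cat"]

model = DecisionTreeClassifier().fit(features, labels)
print(model.predict([[28, 7], [3, 4]]))   # expected: ['dog' 'cat']
```

The more labeled examples the model sees, the better its predictions tend to become, which is the core idea the paragraph above describes.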

How is Artificial Intelligence Being Used Today?

Artificial Intelligence is a growing technology that has found itself being used in many industries for many different purposes. Some popular examples of Artificial Intelligence are Apple's Siri, Amazon Alexa, Google Assistant, Netflix's recommendation algorithm, and Nest thermostats, along with other companies incorporating AI in their products. AI is also being used in self-driving cars, like those from Tesla and Mercedes-Benz, and across the automotive industry more broadly, for example in brake and crash-avoidance systems.

Artificial Intelligence is being used in many ways in the workplace, often in ways that people don’t even realize. AI is also commonly used in customer support, security systems, and to automate many tasks that people take for granted.

What is the Future of Artificial Intelligence?

The future of Artificial Intelligence is extremely broad and could present many new and creative outlets to give humans a better quality of life and enable us to multitask in ways we never have before. This could include, but is not limited to, fully robot-controlled assembly lines and autonomous household appliances that could make meals and wash dishes without a user ever intervening. Other outlets for Artificial Intelligence could be automating hospitals and other first responder services like police and fire departments. Artificial Intelligence could reduce the risk of first responders getting injured and could potentially improve early detection of threats and natural disasters.

Categories
Big Data, Data Analytics

NLP vs. NLU: from Understanding a Language to Its Processing

As artificial intelligence progresses and technology becomes more sophisticated, we expect existing concepts to embrace this change, or to change themselves. Similarly, in the domain of computer-aided processing of natural languages, should the concept of natural language processing give way to natural language understanding? Or is the relation between the two concepts subtler and more complicated than a merely linear progression of a technology?

In this post, we'll scrutinize the ideas of NLP and NLU and their niches in AI-related technology.

Importantly, though sometimes used interchangeably, they are actually two different concepts that have some overlap. First of all, they both deal with the relationship between natural language and artificial intelligence. They both attempt to make sense of unstructured data, like language, as opposed to structured data like statistics, actions, etc. In this respect, NLP and NLU stand apart from many other data mining techniques.

Source: Stanford

Natural Language Processing

NLP is an already well-established, decades-old field operating at the cross-section of computer science, artificial intelligence, and, increasingly, data mining. The ultimate goal of NLP is for machines to read, decipher, understand, and make sense of human languages, taking certain tasks off humans' hands and allowing a machine to handle them instead. Common real-world examples…
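As a small illustration of everyday NLP tasks, the sketch below uses spaCy (assuming the en_core_web_sm model has been installed separately) to tokenize a sentence, tag parts of speech, and extract named entities.

```python
# Minimal sketch of common NLP tasks with spaCy.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Salesforce will buy Tableau Software for $15.7 billion.")

print([token.text for token in doc])                   # tokenization
print([(token.text, token.pos_) for token in doc])     # part-of-speech tags
print([(ent.text, ent.label_) for ent in doc.ents])    # named entities
```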

Read More on Dataflow

Categories
Big Data, Data Analytics

The Advantages of Automation for Wastewater Treatment

Automation is everywhere these days, helping us to make better use of labor, time and other resources, and leading to the development of cleaner, future-ready industries. It's no surprise to see automation edging into wastewater treatment: where public health is concerned, this is one of the most critical industrial-scale activities on the planet today. Here are some of the ways technology is making the process more efficient and cost-effective.

Lower Energy Costs

Not surprisingly, energy use is the single biggest expense for wastewater treatment plants. Automating infrastructure provides one way to reduce the energy expenditure of a number of critical water treatment processes. One example is the blowers located in holding basins, which keep the water aerated. Some estimates say blowers account for up to 60% of a treatment plant's total energy consumption.

Automation can improve cost-effectiveness in this area through data collection. Instead of operating the blowers constantly at a fixed speed, plants can use information about effluent levels in holding basins to apply air and remove solids only when it's necessary to do so. This reduces energy costs, helps maintain a steady flow and reduces wear and tear on equipment.
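A simplified sketch of that kind of data-driven control is shown below; the dissolved-oxygen setpoints and sensor readings are invented for illustration and are not taken from any real plant's configuration.

```python
# Simplified threshold control for an aeration blower: run the blower only
# when dissolved oxygen drops below a setpoint. All numbers are illustrative.
DO_LOW, DO_HIGH = 1.5, 2.5   # dissolved-oxygen setpoints, mg/L (invented)

def blower_command(dissolved_oxygen_mg_l: float, blower_on: bool) -> bool:
    """Return the next on/off state, with a dead band to avoid rapid cycling."""
    if dissolved_oxygen_mg_l < DO_LOW:
        return True
    if dissolved_oxygen_mg_l > DO_HIGH:
        return False
    return blower_on   # inside the dead band: keep the current state

state = False
for reading in [2.8, 2.0, 1.2, 1.8, 2.7]:   # hypothetical sensor readings
    state = blower_command(reading, state)
    print(reading, "->", "ON" if state else "OFF")
```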

Constant Access to Data and Ongoing Sampling

In wastewater treatment and many…

Read More on Dataflow
