Text/Content Analytics 2011:
   User Perspectives on
  Solutions and Providers

                       Seth Grimes


           An Alta Plana research study
                   Sponsored by




  Published September 9, 2011 under the Creative Commons Attribution 3.0 License.
Text/Content Analytics 2011: User Perspectives



Table of Contents
Executive Summary
     Market Size and Growth
     Growth Drivers
     The 2011 Market
     The Study
     Key Study Findings
     About the Study and the Report
Text and Content Analytics Basics
     From Patterns…
     … To Structure
     Beyond Text
     Metadata
     A Focus on Applications
Applications and Markets
     Application modes
     Business Domains
     Business Functions
     Technology Domains
     Solution Providers
Demand-Side Perspectives
     Study Context
     About the Survey
     Market Size and the Larger BI Market
     The Data Mining Community
Demand-Side Study 2011: Findings
     Q1: Length of Experience
     Q2: Application Areas
     Q3: Information Sources
     Q4: Return on Investment
     Q5: Mindshare
     Q6: Spending
     Q8: Satisfaction
     Q9: Overall Experience
     Q10: Providers
     Q11: Provider Selection
     Q13: Promoter?
     Q14: Information Types
     Q15: Important Properties and Capabilities
     Q16: Languages
     Q17: BI Software Use
     Q18: Guidance
     Q19: Comments
     Additional Analysis
     Interpretive Limitations and Judgments
About the Study
Solution Profile: AlchemyAPI
Solution Profile: Attensity
Solution Profile: Basis Technology
Solution Profile: Language Computer Corp.
Solution Profile: Lexalytics
Solution Profile: Medallia
Solution Profile: SAS
Solution Profile: Sybase
Solution Profile: Verint Systems Inc.






Executive Summary
       Text and content analytics have become a source of competitive advantage,
       enabling businesses, government agencies, and researchers to extract
       unprecedented value from “unstructured” data. Uptake is strong – software,
       solutions, and services are delivering significant business value to users in a
       spectrum of industries – yet the potential of the market remains unreached.
       These points and more are brought out in Alta Plana’s market study,
       “Text/Content Analytics 2011: User Perspectives on Solutions and Providers.”
                                                                       Market Size and Growth
       Tools and solutions now cover the gamut of business, research, and governmental
       needs. User adoption continues to grow at a very rapid pace, an estimated 25% in 2010,
       creating an $835 million market for software tools, business solutions, and vendor
       supplied support and services. These tools and solutions generate business value several
       times that figure, extrapolating from revenue generated by applications and solutions (for
       instance, social-media analysis, e-discovery, and search), information products created by
       mining content, professional services, and research.
       The addressable market for text/content analytics is much larger. The technologies are a
       subset of a larger business intelligence, analytics, and performance management software
       market, which is dominated by solutions that analyze numerical data that originates in
       enterprise operational systems. Gartner estimated that larger market at $10.5 billion
       globally in 2010. Yet, given now-broad awareness of the business value that resides in
       “unstructured” social, online, and enterprise sources, text/content-analytics’ share of the
       much larger market will surely grow steeply in coming years. Overall, expect annual
       text/content-analytics growth averaging up to 25% for the next several years.
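       The arithmetic behind these figures can be sketched as follows. This is an
       illustration only: it takes the report's $835 million and $10.5 billion estimates
       at face value and treats the "up to 25%" rate as a constant upper bound, which is
       an assumption made here for the sake of the example, not a forecast from the study.
       The function and variable names are hypothetical.

```python
# Illustrative arithmetic only. The inputs are the report's estimates; the
# constant 25% rate is an assumed upper bound, not a forecast.
MARKET_2010_MUSD = 835.0         # text/content analytics market, 2010 (USD millions)
BI_MARKET_2010_MUSD = 10_500.0   # larger BI/analytics market, 2010 (Gartner estimate)
MAX_ANNUAL_GROWTH = 0.25         # "up to 25%" per year


def project_market(base, rate, years):
    """Return the market size for each of the next `years` years under compound growth."""
    return [base * (1 + rate) ** n for n in range(1, years + 1)]


# Share of the larger market held by text/content analytics in 2010.
share_2010 = MARKET_2010_MUSD / BI_MARKET_2010_MUSD

# Upper-bound sizes for 2011, 2012, and 2013 if growth held at 25% each year.
projection = project_market(MARKET_2010_MUSD, MAX_ANNUAL_GROWTH, 3)

print(f"2010 share of the larger market: {share_2010:.1%}")
for year, size in zip(range(2011, 2014), projection):
    print(f"{year}: ${size:,.0f} million (upper bound)")
```

       Because the report says growth will average "up to" 25%, the projected values
       are best read as a ceiling rather than an expected path.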
                                                                                 Growth Drivers
       A number of factors contribute to sustained growth, foremost the growth of social
       platforms, which have become essential life tools for individuals and an important
       business marketing, communication, research, and commerce channel.
       Social
       Keeping up with Social is a must for every consumer-facing organization, and automated
       monitoring, measurement, and engagement is the only way to deal with Social’s variety,
       volume, and velocity. Leading solutions rely on natural-language processing, provided
       by text/content analytics, to identify and extract facts and sentiment. Expect even
       lower-end tools to embrace NLP by 2013.
       Publishing, advertising, and information services
       Second, text/content analytics is central to competitive online publishing and
       advertising and to effective information access (essentially, next-generation search). These
       are two sides of a single coin. As applied by content producers and publishers,
       technologies discover and associate appropriate descriptive and semantic labels with
       content. The aims are to optimize search findability, to allow content to be stored and
       retrieved at a fine-grained level (documents as databases), and to enhance the content
       consumer’s experience interacting with content. As applied by search, content
       aggregation, online advertising, and information-service providers, the technology fuels
       situationally appropriate results that respond to the information/service seeker’s context
       and intent.





           Question-answering and information access
           Question-answering systems such as IBM Watson and Wolfram Alpha are examples of
           next-generation, analytics-enabled information-access engines, which will play a key role
           in online commerce, customer support, health-service delivery, and other applications
           starting by early 2013. Similarly, Semantic Web information resources should finally enter
           the mainstream by 2014. They will very frequently rely on analytics to semanticize and
           structure content and support on-the-fly information integration.
           Rich media
           Last, content analytics makes sense of rich media. The technology finds and exploits
           patterns – what’s in a given piece of content and how that content changes
           over time – in speech and sound, images, and video. Today there are important
           content-analytics applications for contact centers, security, general information access,
           and even in consumer electronics: Witness face detection and tracking in consumer-grade
           cameras and camcorders. Arguably, we could include analyses of social and enterprise networks,
           mined from e-mail, messaging, online, and social content, under the content-analytics
           umbrella.
                                                                                       The 2011 Market
           As in prior years, no single solution provider dominates the market. Players range from
           the largest enterprise software vendors to a stream of new entrants, both
           commercializing research technologies and bringing solutions to new markets. In
           between, established enterprise content management (ECM), BI and analytics, search,
           software tools, and business-solution providers – the sponsors of this study among them –
           continue to innovate and deliver business value.
                                                                                                The Study
           Alta Plana’s 2011 text/content analytics market study combines a survey-based,
           quantitative and qualitative examination of usage, perceptions, and plans with
           observations derived from numerous conversations with solution providers and users. It
           seeks to answer the question, “What do current and prospective text/content-analytics
           users really think of the technology, solutions, and solution providers?” Responses will
           help providers craft products and services that better serve users. Findings will guide
           users seeking to maximize benefit for their own organizations.
           Alta Plana received 224 valid survey responses between June 6 and July 9, 2011. This
           document reports findings and, when appropriate, contrasts them with comparable
           numbers from Alta Plana’s spring-2009 text-analytics market study.1
                                                                                     Key Study Findings
           The following are key 2011 study findings:
                  The big news is not news at all: Social is by far the most popular source fueling
                   text/content analytics initiatives. Four of the top 5 information categories are
                   social/online (as opposed to in-enterprise) sources:
                       o blogs and other social media (62%)
                       o news articles (41%)
                       o on-line forums (35%)
                       o reviews (30%)


1
    “Text Analytics 2009: User Perspectives on Solutions and Providers”: http://altaplana.com/TA2009




               as well as direct customer feedback in the form of:
                   o   customer/market surveys (35%)
                   o   e-mail and correspondence (29%)
               for an average of 4.5 sources per respondent.
              All three top capabilities that users look for in a solution, each garnering over 50%
               response, relate to getting the most information out of sources:
                    o Broad information extraction capabilities (63%)
                    o Ability to use specialized dictionaries, taxonomies, ontologies, or
                        extraction rules (57%)
                    o Deep sentiment/emotion/opinion extraction (57%)
               Low cost dropped from 51% of 2009 responses to 38% in 2011.
              Top business applications of text/content analytics for respondents are the
               following:
                    o Brand / product / reputation management (39% of respondents)
                    o Voice of the Customer / Customer Experience Management (39%)
                    o Search, Information Access, or Question Answering (36%)
                    o Competitive intelligence (33%)
              Seventy percent of users are Satisfied or Completely Satisfied with text/content
               analytics and 24% are Neutral with only 7% Disappointed or Very Disappointed.
               Dissatisfaction is greatest, at 25%, with ease of use, with only 36% satisfied. Only
               42% are satisfied with availability of professional services/support.
              Only 49% of users are likely to recommend their most important provider. 28%
               would recommend against their most important provider.
                                                               About the Study and the Report
       Seth Grimes, an industry analyst and consultant who is a recognized authority on the
       application of text analytics, designed and conducted the study “Text/Content Analytics
       2011: User Perspectives on Solutions and Providers” and wrote this report.
       The author is grateful for the support of the nine study sponsors, Verint, Sybase, SAS,
       Medallia, Lexalytics, Language Computer Corporation, Basis Technology, Attensity, and
       AlchemyAPI. Their sponsorships allowed him to conduct an editorially independent study
       that should promote understanding of the text/content analytics market and of user-
       indicated implementation and operations best practices. The solution profiles that follow
       the report’s editorial matter were provided by the sponsors and included with only minor
       editing to regularize their layout. Otherwise, the author is solely responsible for the
       editorial content of this report, which was not reviewed by the sponsors prior to
       publication.







Text and Content Analytics Basics
       The term text analytics describes software and transformational processes that uncover
       business value in “unstructured” text via the application of statistical, linguistic,
       machine learning, and data analysis and visualization techniques. The aim is to improve
       automated text processing, whether for search, classification, data and opinion
       extraction, business intelligence, or other purposes.
       Rough synonyms include text mining, text ETL, and semantic analysis. Terminology
       choices are typically rooted in history and competitive positioning. Text mining is an
       extension of data mining, and text ETL extends the BI world’s extract-transform-load concept.
       Semantic analysis seems most often used by Semantic Web aficionados, who sometimes
       use the broader term Semantic Web technologies, which also covers protocols such as
       RDF, triple stores, query systems, and the like.
       These text technologies all perform some form of natural language processing (NLP).
       Content analytics can and should be seen as an extension of capabilities to also cover
       images, audio and speech, video, and composites, the gamut of information types not
       generated or held in data fields. (Some organizations use the content analytics label for
       text analytics on online, social, and enterprise content, typically, published information.
       These organizations most often have a strong focus on enterprise content management
       (ECM) systems.)
                                                                                  From Patterns…
       Text, images, speech and other audio, and video are all directly understandable by
       humans (although not universally: Any given human language – English, Japanese, or
       Swahili – is spoken by a minority of people, and not everyone recognizes a Beethoven
       symphony or Nelson Mandela in a photo). Understanding relies on three capabilities:
             1) Ability to recognize small- and large-scale patterns.
             2) Ability to grasp context and, from context, to infer meaning.
             3) Ability to create and apply models.
       Descriptive statistics provides an NLP starting point: The most frequently used words and
       terms give an indication of the topics a message or document is about. We can create
       categories and classify text (a form of modeling) based on notions of statistical similarity.
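       A minimal sketch of this statistical starting point follows. It counts terms to
       suggest what a message is about, then assigns the message to whichever category
       description it most resembles by cosine similarity of term counts. The stopword
       list, the category descriptions, the sample message, and all function names are
       hypothetical, chosen only for illustration; production systems use far richer
       resources.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "was", "for", "with"}


def term_counts(text):
    """Lowercase the text, split it into word tokens, and count non-stopword terms."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, category_examples):
    """Return the category whose example text is most similar to `text`."""
    counts = term_counts(text)
    return max(
        category_examples,
        key=lambda c: cosine_similarity(counts, term_counts(category_examples[c])),
    )


# Hypothetical category descriptions and message.
categories = {
    "billing": "invoice charge refund payment billing charge",
    "support": "agent wait call help support response",
}
message = "I was charged twice and want a refund for the second charge"

print(term_counts(message).most_common(3))  # frequent terms hint at the topic
print(classify(message, categories))
```

       Note that "charged" and "charge" are counted as different terms here, which is
       one reason the linguistic steps described next matter.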
       Next steps take advantage of the linguistic structure of text, detectable by machines as
       patterns. We have word form (“morphology”) and arrangement (grammar and syntax) as
       well as higher-level narrative and discourse. Usage may be correct (as judged by editors,
       grammarians, and linguists) or not, whether the language is spoken, formally written, or
       texted or tweeted: The most robust technologies deal with text in the wild. We apply
       assets such as lexicons of “named entities”; part-of-speech resolution that can help
       identify subject, object, relationship, and attributes; and “word nets” that associate words
       to help in disambiguation, determination of the contextual sense of terms that may have
       different meanings in different contexts.
       Yet, in the words of artificial-intelligence pioneer Edward A. Feigenbaum,
             “Reading from text in general is a hard problem, because it involves all of
              common sense knowledge. But reading from text in structured domains, I
              don’t think is as hard.”
       So some techniques (also) apply knowledge representations such as ontologies to the
       analysis task. All techniques, however, aim to generate machine-processable structure.




                                                                                  … To Structure
       NLP outputs, as part of a text-analytics system, are typically expressed in the form of
       document annotations, that is, in-line or external tags that identify and describe features
       of interest. Outputs may be mapped into machine-manageable data structures, whether
       relational database records or XML, JSON, RDF, or another format.
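       As a sketch of what such output can look like, the example below produces
       external ("stand-off") annotations by looking up terms from a small lexicon of
       named entities and serializes them as JSON. The lexicon, the labels, the
       Annotation fields, and the sample sentence are all hypothetical; real systems
       combine lexicons with statistical and linguistic methods and use their own
       annotation schemas.

```python
import json
import re
from dataclasses import dataclass, asdict


@dataclass
class Annotation:
    """A stand-off tag: character offsets into the document plus a label."""
    start: int
    end: int
    label: str
    text: str


# Hypothetical mini-lexicon of named entities.
LEXICON = {
    "Nelson Mandela": "PERSON",
    "Beethoven": "PERSON",
    "IBM": "ORGANIZATION",
}


def annotate(document, lexicon):
    """Find whole-word lexicon matches and return annotations in document order."""
    annotations = []
    for term, label in lexicon.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, document):
            annotations.append(
                Annotation(match.start(), match.end(), label, match.group())
            )
    return sorted(annotations, key=lambda a: a.start)


doc = "IBM researchers played a Beethoven symphony."
annotations = annotate(doc, LEXICON)

# The document text is left untouched; the annotations are kept separately as JSON.
print(json.dumps([asdict(a) for a in annotations], indent=2))
```

       Keeping annotations external to the text lets several analysis steps add tags
       to the same document without rewriting it, and the same records map readily to
       database rows or RDF statements.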
       Text-extracted data represented in the Semantic Web’s Resource Description Framework
       (RDF) may form part of a Linked Data system. Text-derived information stored in a
       relational database may become part of a business intelligence system that jointly
       analyzes, for instance, DBMS-captured customer transactions and free-text responses to
       customer-satisfaction surveys. And text-extracted features such as entities, topics, dates,
       and measurement units may form the basis of advanced semantic search systems.
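       The joint-analysis case can be sketched with an in-memory SQLite database. The
       table names, columns, and rows below are invented for illustration, and the
       sentiment labels stand in for whatever a text-analytics step would extract from
       free-text survey responses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one table captured by operational systems, one
# populated from text analysis of survey responses.
conn.executescript(
    """
    CREATE TABLE transactions (customer_id INTEGER, amount REAL);
    CREATE TABLE survey_sentiment (customer_id INTEGER, sentiment TEXT);
    """
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 40.0), (3, 200.0)],
)
conn.executemany(
    "INSERT INTO survey_sentiment VALUES (?, ?)",
    [(1, "positive"), (2, "negative"), (3, "positive")],
)

# Join transactional data with text-derived sentiment and aggregate spend.
rows = conn.execute(
    """
    SELECT s.sentiment,
           COUNT(DISTINCT t.customer_id) AS customers,
           SUM(t.amount) AS total_spend
    FROM transactions AS t
    JOIN survey_sentiment AS s ON s.customer_id = t.customer_id
    GROUP BY s.sentiment
    ORDER BY s.sentiment
    """
).fetchall()

for sentiment, customers, total_spend in rows:
    print(sentiment, customers, total_spend)
```

       Once text-derived values sit in ordinary columns, existing BI tools can query
       them alongside transactional data without any knowledge of how they were extracted.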
                                                                                     Beyond Text
       Beyond-text technologies for information-extraction from images, audio, video, and
       composite media exist but do not match NLP’s sense-making capabilities. Likely most
       developed is speech-analysis technology that supports indexing and search using
       phonemes and is capable of detecting emotion in speech via analysis of indicators such as
       pace, volume, and intonation, with contact-center and other applications that include
       intelligence. Intelligence, along with consumer and social search, motivates work on
       image analysis, as do marketing and competitive-intelligence related studies of online and
       social brand mentions and use. Video analytics extends both speech and image analysis,
       with an added temporal aspect, for security applications and also potential business uses
       such as study of customer in-store behavior.
       For beyond-text media, as for text, metadata is of critical importance.
                                                                                        Metadata
       Metadata describes data properties that may include the provenance, structure, content,
       and use of data points, datasets, documents, and document collections. Content-linked
       metadata typically includes author, production and modification dates, title, topic(s),
       keywords, format, language, encoding (e.g., character set), rights, and so on. The
       metadata label extends to specialized annotations such as part-of-speech and data type.
       Metadata may be created as part of content production or publication (for instance, the
       save date captured by a word-processor, a geotag associated with a social update, camera
       information stored in an image file). It may be appended (for instance via social tagging),
       or extracted from content via text/content analysis. Whether stored internally within a
       data object (for instance via RDFa, FOAF, or other microformats embedded in a Web page)
       or managed externally, in a database or search index, metadata is fuel for a range of
       applications.
                                                                        A Focus on Applications
       We will not devote further space in this report to discussion of text- and content-analysis
       technology. If you do want to learn more about text-analytics history and technology, do
       continue with the technology sections of Alta Plana’s 2009 study report, “Text Analytics
       2009: User Perspectives on Solutions and Providers,” available online at
       http://altaplana.com/TextAnalyticsPerspectives2009.pdf.
       As a bridge to survey-derived reporting of user perceptions of the text and content
       analytics market, solutions, and providers, we will look next at applications.







Applications and Markets
          Business users naturally focus on business benefits, whether of analytics or of any other
          technology or investment. Who are those users?
          Text and content analytics solutions have a place a) in any business domain, b) for any
          business function, and c) within any technology stack, that would benefit from automated
          text/content handling, that is, wherever text/content volume, velocity, and variety, and
          business urgency, are sufficient to justify costs. Consider a very telling quotation,
          however: Philip Russom of the Data Warehousing Institute wrote in a 2007 report, “BI
          Search and Text Analytics: New Additions to the BI Technology Stack,”2
                  “Organizations embracing text analytics all report having an epiphany moment
                   when they suddenly knew more than before.”
          In the analytics world, we see now that it is not enough to know more. You need to
          understand how to use knowledge gained, the processes and outcomes necessary to turn
          insights into ROI. Text and content analytics elements – information sources, insights
          sought, processes, and ROI measures – will vary by industry and application.
          In this report section, by way of lead-in to survey findings – applications, information
          sources, and ROI measures are the subject of survey questions 2, 3, and 4 – we look at
          text/content analytics adaptation for applications in several industries and for a variety of
          business functions.
                                                                                  Application modes
          Applications are diverse but may be classified in several (overlapping) groups. Our
          categorization is an update of 2009’s, with social and online additions in particular:
                  Media, knowledgebase, and publishing systems – the author includes search
                   engines here – use text and content analytics to generate metadata and enrich
                   and index metadata and content in order to support content distribution and
                   retrieval. Semantic Web applications would fit in this category, as would
                   emerging information-access engines.
                  Content management systems – and, again, related search tools – use text
                   analytics to enhance the findability of content for business processes that include
                   compliance, e-discovery, and claims processing.
                  Line-of-business and supporting systems for functions such as compliance and
                   risk, customer experience management (CEM), customer support and service,
                   marketing and market research, human resources and recruiting… and newer
                   tasks that include social monitoring, measurement, and engagement.
                  Investigative and research systems for functions such as fraud, intelligence and
                   law enforcement, competitive intelligence, and science.
          Where are these applications used?
                                                                                   Business Domains
          Consider a sampling of industry domains where text and content analytics are frequently
          applied:
                  In intelligence and counter-terrorism, and in law enforcement, there is broad
                   content variety – languages, format (text, audio, images, and video), sources
                   (news, field reports, communications intercepts, government records, social

2
    http://www.teradata.com/assets/0/206/308/96d9065a-0240-44f1-b93c-17e08ae6eacc.pdf




               postings) – and, at times, great urgency.
              In life sciences, for instance in pharmaceutical drug discovery, source materials
               have been more uniform (scientific literature, clinical reports) and there is no
               need for real-time response, yet information volumes are huge and complex, and
               the potential payoff – years and millions of dollars shaved off lead-generation and
               clinical-trials processes – justifies very significant investments in text mining.
              For financial services and insurance, effective credit, risk, fraud, and legal and
               regulatory-compliance decision-making involves creation of predictive models via
               analysis of large volumes of transactional records and often incorporates
               information mined from text sources such as financial and news reports, e-mail
               and corporate correspondence, insurance and warranty claims. Automated
               methods are essential.
              Market researchers rely on text analytics to hear and understand market voices.
               Focus groups are (on their way) out: They are costly, slow, and often unreliable.
               Surveys still have great value – beyond soliciting opinions, they can serve as an
               engagement tool – but neither they nor focus groups help researchers hear
               unprompted views, the attitudes that consumers express to their peers but not in
               more formal research settings. Why text analytics? Social is hot, yet human
               analysis, whether of surveys or of social postings, can be inconsistent and doesn’t
               scale. Add in text analytics and you have next-generation market research.
              As content delivery and consumption shift to digital, search and information-
               dissemination tools that exploit metadata (publisher-produced, analytically
               generated, or socially tagged) are essential survival tools for media and
               publishing organizations. Content analytics creates better targeted, richer
               content and a much friendlier and more powerful experience for content
               consumers.
              Online and social have fomented an advertising revolution. Targeting is the word,
               whether based on behaviors (modeled via tracking and clickstream analysis) or on
               analytically computed matching. Matches may draw on user profiles, context
               (geography, accessing application, device or machine being used), inferred
               intent (for instance from search terms), and the semantic signatures of the
               content where ads are to be delivered.
              Text analytics provides essential capabilities in support of legal domain e-
               discovery mandates. Organizations must “produce” materials relevant to
               lawsuits, a task that would often be impossible without automated text
               processing, given huge volumes of electronically stored information generated in
               the course of business. Intellectual property is another legal-domain application.
               The task is to identify names, terminology, properties, and functions salient to an
               IP search that seeks to identify, for instance, prior art and possible patent
               infringement.
                                                                              Business Functions
       Many business tasks are independent of industry. Every organization of any significant
       size has in-house customer support, marketing, product development, and similar
       functions (even if definitions of customer, marketing, and product do, of course,
       vary by industry). Let’s examine the role text and content analytics play in the following:
              Customer experience management (CEM) is a signal text/content analytics
               success story. The aim is to transform customer relationship management (CRM),
               which captures transactions and interactions, into a set of tools and practices that
               cover the engagement span from customer acquisition to customer service and




                   support, first and foremost by listening and responding to the voice of the
                   customer across channels. In plain(er) English, CEM marries text- and speech-
                   sourced information – from e-mail, online forums, surveys, contact-center
                   conversations, and other touchpoints… and also from employee input – with
                   transactional and profile information. The hope is to improve customer
                   satisfaction and operate more efficiently and profitably. Simplistic, reductive
                   indicators such as the Net Promoter Score can only point at issues and challenges.
                   They can neither explain them nor suggest actions or remedies – insights that are
                   accessible (at enterprise scale) only via text and content analytics.
                  Marketers translate market-research and competitive-intelligence findings into
                   marketing campaigns and advertising and, in cooperation with product
                   developers, into higher quality, more satisfying products and services. It’s all
                   about listening.
                   Steve Rappaport, in his book Listen First!, says we should “Change the research
                   paradigm. Social media listening research should bring about an era of real-time
                   data that anticipates change and can be used to visualize and create a rewarding
                   business future,” as well as “rethink marketing, advertising, and media.” His
                   prescriptions about listening apply across channels and touchpoints, as they do
                   for CEM, with the difference here, for research-related functions, being that we
                   are looking at an aggregate rather than an individualized picture, seeking to hear
                   the voice of the market, again aided by text and content analytics. Our aim is to
                   deliver targeted, compelling advertising via more effective marketing and, of
                   course, superior products and services that better meet customer needs.
                  Competitive intelligence, in particular, involves mining customer voices, at both
                   individual and aggregate levels, and also business information, for instance about
                   sales, personnel, alliances, and market conditions that indicate opportunities and
                   threats. The ability to extract domain- and sector-focused information from online
                   and social sources, and to integrate information from disparate sources in order to
                   derive coherent signals, is essential; analytically rooted technologies deliver it.
                  Business intelligence (BI) was first defined, in the late 1950s, in terms of
                   extraction and reuse of knowledge drawn from textual sources.3 BI took off in a
                   different direction, however, starting in the late 1960s, centering on analysis of
                   numerical data captured in computerized corporate operational and transactional
                   systems. Back to the (1950s) Future: Number crunchers of all stripes recognize
                   the business value of information in text sources. They are seeking, with the help
                   of both major and niche BI and data warehousing vendors, to bring text-sourced
                   information into enterprise BI initiatives. Call this integrated analytics, which also
                   incorporates geospatial and machine-generated Big Data to bring businesses a
                   step closer to the sought-after (although mythical) 360° view of the customer
                   (and the market and one’s own business).
                                                                                  Technology Domains
           Last, for context, let’s briefly consider technology domains where text and content
           analytics come into play, namely semantics and the Semantic Web, and then look at
           emerging text analytics applications.
           Semantic Computing
           To restate the definition, text/content analytics involves the acquisition, processing, analysis, and
3
    Seth Grimes, “BI at 50 Turns Back to the Future,” InformationWeek, November 21, 2008:
    http://www.informationweek.com/news/software/bi/211900005




           presentation of enterprise, online, and social information derived from text and rich-
           media sources. The technology is one route to semantics, to generating machine-usable
           identification of information objects attached to databases, tables, fields, and rows; to
           corpora, documents, and document content; and to media files, e-mail and text messages.
           Text/content analytics provides a descriptive route to semantics, making sense of
           information in-the-wild, as generated by humans (and machines) online, on social
           platforms, and in everyday business and personal communications, whether written,
           spoken, or captured in rich media. The alternative route to semantics is prescriptive,
           generated or captured in the course of content generation, whether via database export
           or a plug-in to an authoring application.
           The Semantics market includes technologies for the creation, management, and use of
           artifacts such as taxonomies, ontologies, thesauruses, gazetteers, semantic networks,
           controlled vocabularies, and metadata. These artifacts may be generated manually by
           subject-matter experts. They may be generated automatically by text analytics. And in
           many situations, a hybrid system involving manual curation of automatically generated
           artifacts may be in order.
           Semantics applications include digital content management, publishing, research, and
           librarianship across a broad set of industrial and government applications. The semantics
           market includes semantic search, whether open-domain, vertical (applied to a particular
           information domain), or horizontal (applied in a particular business function). It also
           includes classification and information integration.
           Classification, Search, and Integration
           Semantic computing finds its primary application in classification, search, and integration.
           Classification determines what a data item or object represents, including how it may be
           used, in relation to other data items and objects in a data space. This is, admittedly, an
           abstract and not particularly practical definition. Information integration and search are
           where semantics finds its most compelling applications. Semantic search is, in essence,
           “search made smarter, search that seeks to boost accuracy by taming ambiguity via an
           understanding of context.”4 Several approaches fit under the semantic search umbrella.
           They include related searches, search-results enrichment, concept searches, faceted
           search, and more. The common thread is better matching searcher intent (inferred from
           search context including past searches and the searcher’s profile) to searched-for
           information content. Semantic search is behind many emerging search-based
           applications, fueled by text and content analytics, for applications such as e-discovery,
           faceted navigation for online commerce, and search-driven business intelligence. And it is
           captured semantics, in the form of data identifiers and descriptions, that enables dynamic,
           adaptive information integration, where join paths are discovered based on business and
           application needs rather than hard-wired, as they were in computing generations until recently.
           The Semantic Web
           The Semantic Web is, at its root, an information-integration and sharing application, a set
           of standards and protocols designed to facilitate creation and use of a “Web of data.”
           Eventually, the Semantic Web market will include tools and services that execute
           knowledge-reliant business transactions over distributed, semantically infused data
           spaces. We are years from that market.
           The bulk of Semantic Web-focused expenditures are for government-funded research

4
    Seth Grimes, “Breakthrough Analysis: Two + Nine Types of Semantic Search,” InformationWeek, January
    21, 2010: http://www.informationweek.com/news/software/bi/222400100




       projects at universities and similar institutions. Outside research contexts, business
       implementations do not extend significantly beyond a) the use of microformats and RDFa
       (Resource Description Framework–attributes) to allow Web-published structured data to
       be indexed by search engines to facilitate information access and b) the use of RDF triples
       as a convenient format for structuring facts for storage in DBMSes supporting graph-
       database schemas to facilitate integration and query of data from disparate sources.
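
       As a rough illustration of use (b), the sketch below holds facts as
       (subject, predicate, object) triples in a plain Python list and answers a
       question that spans two sources. It is a toy under stated assumptions: the
       identifiers, predicates, and facts are invented for the example, and a real
       deployment would use an RDF store and a query language rather than nested loops.

```python
# Toy illustration only: every identifier and fact below is invented.
# Each fact is a (subject, predicate, object) triple.

# Facts from a hypothetical customer database.
crm_facts = [
    ("customer:42", "name", "Acme Corp"),
    ("customer:42", "locatedIn", "city:Denver"),
]

# Facts extracted by text analytics from a hypothetical social posting.
web_facts = [
    ("post:7", "mentions", "customer:42"),
    ("post:7", "sentiment", "negative"),
]

# Integration is simple concatenation because both sources share one shape.
store = crm_facts + web_facts


def match(s=None, p=None, o=None):
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for triple in store:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


def negative_mentions():
    """Names of customers mentioned in posts with negative sentiment."""
    names = []
    for post, _, _ in match(p="sentiment", o="negative"):
        for _, _, customer in match(s=post, p="mentions"):
            for _, _, name in match(s=customer, p="name"):
                names.append(name)
    return names


print(negative_mentions())
```

       The point of the design is that the join path (post to customer to name) is
       chosen at query time, not fixed in a schema ahead of time.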
       At some point, however, perhaps in 2-4 years, the Semantic Web will reach a tipping
       point where its business value, and the revenues generated by technology and solutions
       sales, licensing, and support, will explode.
       Value Today
       At this time, text/content analytics delivers business value that is greater by far than the
       value delivered by related semantic and Semantic Web technologies. This is because the
       vast majority of subject information – text, images, audio, and video (a.k.a. content) – is in
       “unstructured” form, just a string of bytes (and terms, in the case of text) so far as
       software systems – Web browsers and office productivity tools, content management
       systems, search engines – are concerned.
       To make content tractable for business ends, for operational or analytical purposes or in
       order to monetize content as a product, one must first create structure. To maximize
       content usability, for most social and for many enterprise sources, generated structure
       will take into account semantic information extracted from source materials. That is,
       structure shouldn’t be arbitrary, a matter of sticking information into a set of round
       pigeonholes for square-peg content.
       This process of the discovery, extraction, and use of semantic information in content is the
       domain of text/content analytics solutions.
                                                                                Solution Providers
       The aggregate characteristics of the text and content analytics solution-provider spectrum
       are little changed since 2009, although there has been significant turnover in players. We
       still have, as reported in 2009, “a significant cadre of young pure-play software vendors,
       software giants that have built or acquired text technologies, robust open-source projects,
       and a constant stream of start-ups, many of which focus on market niches or specialized
       capabilities such as sentiment analysis.”
       The big change is in delivery mode. The market now favors as-a-service analytics, whether
       delivered as online applications, provisioned in the cloud, or provided via Web application
       programming interfaces (APIs). This shift makes sense.
              The most in-demand new information sources are online, social, and on-cloud.
              Use of as-a-service, cloud, and via-API applications means low up-front
               investment, faster time to use, and pay-as-you-go pricing without IT involvement.
              Certain providers offer as-a-service access to both historical and current data at
               attractive costs given the buy-once, sell-many-times economies they enjoy.
              Modern applications are designed to draw data via APIs, facilitating application-
               inclusion of plug-in text and content analytics capabilities.
       There is every expectation that the solution-provider market will continue to evolve to
       keep pace with user needs and broad-market business and technical trends.







Demand-Side Perspectives
        Alta Plana designed a 2011 survey, “Text/Content Analytics demand-side perspectives:
        users, prospects, and the market,” to collect raw material for an exploration of key text-
        analytics market-shaping questions:
               What do customers, prospects, and users think of the technology, solutions, and
                vendors?
               What works, and what needs work?
               How can solution providers better serve the market?
               Will companies expand their use of text analytics in the coming year? Will
                spending on text/content analytics grow, decrease, or remain the same?
        It is clear that current and prospective text/content-analytics users wish to learn how
        others are using the technology, and solution providers of course need demand-side data
        to improve their products, services, and market positioning, to boost sales and better
        satisfy customers. The Alta Plana study therefore has two goals:
               To raise market awareness and educate current and prospective users.
               To collect information of value to solution providers, both study sponsors and
                non-sponsors.
        Survey findings, as presented and analyzed in this study report, provide a measure
        of the state of the market, a benchmark of sorts. They are designed to be of use to
        everyone who is interested in the commercial text/content-analytics market.
                                                                                    Study Context
        The author previously explored market questions in a number of papers and articles.
        These included white papers created for the Text Analytics Summit in 2005, “The
        Developing Text Mining Market,”5 and 2007, “What's Next for Text.”6
        A systematic look at the demand side provides a good complement to provider-side views
        and to vendor- and analyst-published case studies, including the author’s own. This
        understanding motivated the 2009 study, “Text Analytics 2009: User Perspectives on
        Solutions and Providers,” available for free download.7
        That research was preceded by Alta Plana’s 2008 study report, “Voice of the Customer:
        Text Analytics for the Responsive Enterprise,”8 published by BeyeNETWORK.com, a first
        systematic survey of demand-side perspectives, albeit focused on a particular set of
        business problems. VoC analysis is frequently applied to enhance customer support and
        satisfaction initiatives, in support of marketing, product and service quality, brand and
        reputation management, and other enterprise feedback initiatives.
                                                                                 About the Survey
        There were 224 responses to the 2011 survey, which ran from June 6 to July 9, 2011.
        (Contrast with 116 responses to the 2009 survey, which ran from April 13 to May 10,
        2009.)




5
  http://altaplana.com/TheDevelopingTextMiningMarket.pdf
6
  http://altaplana.com/WhatsNextForText.pdf
7
  http://altaplana.com/TA2009
8
  http://altaplana.com/BIN-VOCTextAnalyticsReport.pdf




       Survey invitations
       The author solicited responses via
                E-mail to the TextAnalytics, SentimentAI, Corpora, Lotico, BioNLP, Information-
                 Knowledge-Content-Management, and ContentStrategy lists and the author’s
                 personal list.
                Invitations published in electronic newsletters: InformationWeek, BeyeNETWORK,
                 CMSWire, KDnuggets, AnalyticBridge, and Text Analytics Summit.
                Notices posted to LinkedIn forums and Facebook groups and on Twitter.
                Messages sent by sponsors to their communities.
       Survey introduction
       The survey started with a definition and brief description as follows:
           Text Analytics / Content Analytics is the use of computer software or
           services to automate
               • annotation and information extraction from text – entities, concepts,
                 topics, facts, and attitudes,
               • analysis of annotated/extracted information,
               • document processing – retrieval, categorization, and classification,
                 and
               • derivation of business insight from textual sources.
           This is a survey of demand-side perceptions of text technologies,
           solutions, and providers. Please respond only if you are a user, prospect,
           integrator, or consultant. There are 21 questions. The survey should take
           you 5-10 minutes to complete.
           For this survey, text mining, text data mining, content analytics, and text
           analytics are all synonymous.
           I'll be preparing a free report with my findings. Thanks for participating!
           Seth Grimes (grimes@altaplana.com, +1 301-270-0795)
       The introduction ended with the text:
           Privacy statement: This survey records your IP address, which we will use
           only in an effort to detect bogus responses. It is your choice whether to
           provide your name, company, and contact information. That information
           will not be shared with sponsors without your permission, and if shared
           with sponsors, it will not be linked to your survey responses.






       Survey response
       There is little question that, relative to the broad set of potential business,
       government, and academic users, the survey results overweight current text-analytics
       users: 73% of respondents who answered Q1, “How long have you been using Text
       Analytics?” (n=224), and 78% of respondents who replied to Q7, “Are you currently
       using text/content analytics?” (n=206). (The difference in percentages is likely due to a
       higher rate of survey abandonment among non-users. The figures contrast with 63% and
       61% in the 2009 survey.) So call this a Pac Man question, one whose response indicates
       very significant survey selection bias:

                      Are you currently using text/content analytics? (n=206)

                          Yes    78.2%
                          No     21.8%




                                                        Market Size and the Larger BI Market
       We can infer overweighting by comparing market-size figures. The author estimates an
       $835 million 2010 global market for text/content-analytics software and vendor-supplied
       support and services. As the author described in the May 12, 2011 InformationWeek
       article “Text-Analytics Demand Approaches $1 Billion,”9
             “My $835 million market-size estimate covers software licenses, service
              subscriptions, and vendor-provided technical support and professional
              services. Despite strong growth, it remains a small fraction of Gartner's
              $10.5 billion 2010 valuation of the broader BI, analytics, and performance-
              management software market.”10
       By contrast, the 2009 text-analytics market report cited the author’s figure of $350 million
       for the global, 2008 text analytics market. (That figure did not account for search-based
       applications, which were included in the 2010 market-size estimate.) The 2009 report
       also cited a 2008 BI-market estimate from research firm IDC: “The business intelligence
       tools software market grew 6.4% in 2008 to reach $7.5 billion.”11
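
       To put “a small fraction” in numbers, the short calculation below divides each
       text-analytics estimate by the BI-market figure cited for the same year. Treat it
       as a rough sketch: the Gartner and IDC figures cover differently defined markets,
       and the 2008 text-analytics estimate excluded search-based applications, so the
       two ratios are not strictly comparable. The variable and function names are ours.

```python
# Market-size figures quoted in the text, in millions of US dollars.
TEXT_ANALYTICS = {2008: 350, 2010: 835}   # the author's estimates
BI_MARKET = {2008: 7_500, 2010: 10_500}   # IDC (2008) and Gartner (2010)


def share_percent(year):
    """Text/content analytics as a percentage of the cited BI figure."""
    return round(100 * TEXT_ANALYTICS[year] / BI_MARKET[year], 1)


for year in sorted(TEXT_ANALYTICS):
    print(year, f"{share_percent(year)}%")
```

       On these figures the share comes to roughly 4.7% for 2008 and 8% for 2010.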




9
  http://www.informationweek.com/news/software/bi/229500096
10
   http://www.gartner.com/it/page.jsp?id=1642714
11
   http://www.idc.com/getdoc.jsp?containerId=217443




                                                                        The Data Mining Community
           Another contrasting data point: 65% of respondents (n=121) to a July 2011 KDnuggets
           poll12 reported using text analytics on projects in the preceding year. Results were tallied
           nine days into the poll, before it closed, so final numbers may differ from those
           reported here.
           The figure in a similar March 2009 poll was 55% currently using text analytics/text mining.

           KDnuggets: How much did you use text analytics / text mining in the
           past 12 months?

               Used on over 50% of my projects     21.5%
               Used on 26-50% of my projects        9.9%
               Used on 10-25% of projects          14.9%
               Used on < 10% of my projects        19.0%
               Did not use                         34.7%


           KDnuggets reaches data miners, a technically sophisticated audience who are among the
           most likely of any market segment to have embraced text analytics. The rate of text-
           analytics adoption by data miners surely exceeds the rate of adoption by any other user
           sector.
           As an aside, 49% of KDnuggets respondents stated that in comparison to the last 12
           months, in the next 12 they would use text analytics more, whether on additional projects
           or more intensively on a steady project workload. 43% stated their use would remain
           about the same and only 8% anticipated less use.




12
     http://www.kdnuggets.com/2011/07/poll-text-analytics-use.html





Demand-Side Study 2011: Response
       The subsections that follow tabulate and chart survey responses, which are presented
       without unnecessary elaboration.

       Q1: Length of Experience
       As in 2009, the 2011 survey opened with a basic question –


             How long have you been using Text/Content Analytics?

                                                     2009 (n=107)   2011 (n=224)
             Not using, no definite plans to use         16%             6%
             Currently evaluating                        22%            21%
             Less than 6 months                           8%             3%
             6 months to less than one year               5%             5%
             One year to less than two years              7%            12%
             Two years to less than four years           18%            20%
             Four years or more                          25%            33%

       We see that 2011 responses skew to longer experience than measured in 2009. However,
       survey results were not based on a scientifically designed or measured population
       sample, in either 2011 or 2009. Given how out of proportion survey-measured
       experience is to that of the broad business population (the addressable market for
       text/content analytics likely extends far beyond the current user base), the most
       plausible conclusion one can draw from Q1 responses is that 2011 survey outreach failed
       to bring in the proportion of new and prospective users reached in 2009. Nonetheless, Q1
       responses will prove illuminating in analyses of subsequent survey questions, in studying
       how attitudes vary by length of text/content analytics experience.
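
       The current-user shares quoted earlier for Q1 (73% in 2011, 63% in 2009) can be
       recovered from the charted Q1 percentages by adding up every answer that reports
       some length of use. The sketch below does that arithmetic; the dictionary keys
       are shortened labels of our own, and the inputs are the rounded chart values, so
       a year's column need not sum to exactly 100.

```python
# Q1 percentages as charted in the report (2009: n=107, 2011: n=224).
Q1 = {
    2009: {"not using": 16, "evaluating": 22, "<6 months": 8,
           "6-12 months": 5, "1-2 years": 7, "2-4 years": 18, "4+ years": 25},
    2011: {"not using": 6, "evaluating": 21, "<6 months": 3,
           "6-12 months": 5, "1-2 years": 12, "2-4 years": 20, "4+ years": 33},
}

# Answers that do not count as current use.
NON_USER_ANSWERS = {"not using", "evaluating"}
# Answers that report two or more years of use.
LONG_TERM_ANSWERS = {"2-4 years", "4+ years"}


def user_share(year):
    """Percentage of Q1 respondents reporting any length of current use."""
    return sum(pct for answer, pct in Q1[year].items()
               if answer not in NON_USER_ANSWERS)


def long_term_share(year):
    """Percentage of Q1 respondents reporting two or more years of use."""
    return sum(Q1[year][answer] for answer in LONG_TERM_ANSWERS)


for year in sorted(Q1):
    print(year, user_share(year), long_term_share(year))
```

       The same tallies show the skew toward longer experience: respondents with two or
       more years of use make up 53% of the 2011 sample against 43% in 2009.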






              Q2: Application Areas

             What are your primary applications where text comes into play?

                                                                              2011 (n=219)   2009 (n=103)
             Brand/product/reputation management                                  39%            40%
             Voice of the Customer / Customer Experience Management               39%            33%
             Search, information access, or Question Answering                    39%         not shown
             Research (not listed)                                                36%            33%
             Competitive intelligence                                             33%            37%
             Customer service/CRM                                                 26%            22%
             Product/service design, quality assurance, or warranty claims        15%            14%
             Life sciences or clinical medicine                                   15%            18%
             E-discovery                                                          15%            15%
             Online commerce including shopping, price intelligence, reviews      11%         not shown
             Financial services/capital markets                                   10%            15%
             Other                                                                 9%            13%
             Insurance, risk management, or fraud                                  8%            17%
             Content management or publishing                                      8%            19%
             Military/national security/intelligence                               7%         not shown
             Law enforcement                                                       6%             7%




              The 219 respondents in 2011 chose a total of 748 primary applications, an average of 3.4
              primary applications per respondent. While there is some category overlap, it is notable
              that respondents are applying text analytics toward multiple business needs.






               Q3: Information Sources

           What textual information are you analyzing or do you plan to analyze?

     Information source                                             2011 (n=215)   2009 (n=100)
     blogs and other social media                                        62%            47%
     news articles                                                       41%            44%
     on-line forums                                                      35%            35%
     customer/market surveys                                             35%            34%
     review sites or forums                                              30%            21%
     e-mail and correspondence                                           29%            36%
     scientific or technical literature                                  27%            27%
     contact-center notes or transcripts                                 23%            25%
     Web-site feedback                                                   22%            21%
     text messages/SMS/chat                                              21%             8%
     employee surveys                                                    15%            16%
     field/intelligence reports                                          14%             -
     speech or other audio                                               14%             -
     crime, legal, or judicial reports or evidentiary materials          12%            13%
     medical records                                                     10%            16%
     point-of-service notes or transcripts                                9%            12%
     patent/IP filings                                                    9%            11%
     photographs or other graphical images                                8%             -
     insurance claims or underwriting notes                               7%            15%
     video or animated images                                             6%             -
     warranty claims/documentation                                        5%             7%

     A dash marks a source for which the chart shows no 2009 value.





       The 215 respondents in 2011 chose a total of 962 textual-information sources, an average
       of 4.5 sources per respondent. The big news is not news at all: Social sources are by far
       the most popular and 4 of the top 5 categories are social/online (as opposed to in-
       enterprise) sources. Despite social’s status, however, it is a source for barely more than 6
       out of 10 respondents.
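       The per-respondent averages quoted for Q2 and Q3 follow directly from the reported
       totals. As a minimal sketch, assuming only the counts given in the text (the function
       and variable names are illustrative and not part of the study), the arithmetic can be
       checked as follows:

```python
def mean_selections(total_selections: int, respondents: int) -> float:
    """Average number of options chosen per respondent on a multi-select question."""
    if respondents <= 0:
        raise ValueError("respondents must be positive")
    return total_selections / respondents


# Totals and respondent counts as reported in the text for 2011.
q2_applications = mean_selections(748, 219)  # primary applications (Q2)
q3_sources = mean_selections(962, 215)       # textual-information sources (Q3)

print(f"Q2: {q2_applications:.1f} primary applications per respondent")
print(f"Q3: {q3_sources:.1f} sources per respondent")
```

       Because respondents could pick several options, the selection totals exceed the
       respondent counts, which is why the chart percentages sum to well over 100%.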






          Q4: Return on Investment
          Question 4 asked, “How do you measure ROI, Return on Investment? Have you achieved
          positive ROI yet?” There were 164 respondents. Results are charted from highest to
          lowest values of the sum of “currently measure” and “plan to measure”:


                  How do you measure ROI, Return on Investment?

                                                             Measure:    Measure:       Plan to
     ROI measure                                             Achieved    Not Achieved   Measure
     higher satisfaction ratings                               19%          18%           28%
     increased sales to existing customers                     13%          18%           29%
     ability to create new information products                11%          13%           27%
     improved new-customer acquisition                          9%          15%           25%
     higher customer retention/lower churn                     10%          12%           23%
     higher search ranking, Web traffic, or ad response        10%          12%           22%
     reduction in required staff/higher staff productivity      9%           9%           23%
     fewer issues reported and/or service complaints            9%           6%           23%
     lower average cost of sales, new & existing customers      5%           7%           23%
     faster processing of claims/requests/casework             10%           6%           19%
     more accurate processing of claims/requests/casework       6%           7%           20%


          Out of 164 respondents, 62 (37.8%) report that they have achieved positive ROI according
          to at least one measure. Those 62 respondents reported achieving ROI according to a total
          of 182 measures, that is, 2.94 ROI-achieved measures for each respondent who achieved
          positive ROI.
          Out of 164 respondents, 50 are measuring ROI but have not yet achieved positive ROI
          according to any measure.
          The 112 respondents who are measuring ROI (whether achieved or not) track a total of
          385 measures among them, 3.44 measures per respondent.
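          The Q4 ratios above can be reproduced from the reported counts. The following is a
          minimal sketch, assuming only the figures stated in the text; the variable names are
          illustrative and not part of the study:

```python
# Counts as reported in the text for Q4 (variable names are illustrative).
respondents = 164
achieved_roi = 62             # achieved positive ROI on at least one measure
measuring_not_achieved = 50   # measuring ROI, no positive ROI yet on any measure
achieved_measures = 182       # ROI-achieved measures reported by the 62
tracked_measures = 385        # all measures tracked by respondents measuring ROI

# Respondents measuring ROI, whether or not they have achieved it.
measuring = achieved_roi + measuring_not_achieved

share_achieved = achieved_roi / respondents
achieved_per_respondent = achieved_measures / achieved_roi
tracked_per_respondent = tracked_measures / measuring

print(f"Respondents measuring ROI: {measuring}")
print(f"Share with positive ROI: {share_achieved:.1%}")
print(f"ROI-achieved measures per achieving respondent: {achieved_per_respondent:.2f}")
print(f"Tracked measures per measuring respondent: {tracked_per_respondent:.2f}")
```

          Note that the 112 respondents measuring ROI are the sum of the 62 who have achieved
          it and the 50 who have not.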
          The following are several of the Other responses given:
                  Better customer insight, market intelligence, and competitive intelligence.
                  Content findability.
                  Creation of scientific knowledge.
                  Higher employee engagement and better L&D outcomes.
                  Improvement in existing processes, turnover time.





              Incremental sales lift.
              Lowered cost of fraud, more accurate predictive analytics.
              Number of action executives can take, estimated dollar savings from risk
               correction/avoidance.
              Patient outcomes.
              Providing better data to scholars.
              Reduction of Claim Cost.
              Stronger understanding of subconscious emotional zones.
              We don't know how to measure it properly.

       Q5: Mindshare
       A word cloud, generated at Wordle.net, seemed a good way to present responses to the
       query, “Please enter the names of companies that you know provide text/content
       analytics functionality, separated by commas. List up to the first 8 that come to mind.”
       There were 129 responses, many offering several companies. A bit of data cleansing was
       done, to regularize names and remove inappropriate responses.




       Contrast with the 2009 word cloud (deliberately rendered smaller than the 2011 cloud,
       without an attempt to create sizing consistent between the two clouds) based on 48
       response records, as follows:




       Note that IBM acquired SPSS in mid-2009.






        Q6: Spending
        Question 6 asked about 2010 spending and 2011 expected spending.

             How much did your organization spend in 2010, and how
              much do you expect to spend in 2011, on text/content
                     analytics software/service solutions?

                                        2010 spent (n=176)            2011 expected (n=165)
      $1 million or above                        6%                             7%
      $500,000 to under $1 million               2%                             3%
      $200,000 to $499,999                       4%                             6%
      $100,000 to $199,999                       7%                             7%
      $50,000 to $99,000                         9%                             7%
      under $50,000                            23%                             30%
      use open source                          15%                             19%


        Questions asked of only current text/content-analytics users.
        Questions 8 through 13 were posed exclusively to current text/content analytics users:
        the 81.2% of the 206 respondents to Q7 (“Are you currently using text/content analytics?”)
        who answered yes.

        Q8: Satisfaction
        Question 8 asked, “Please rate your overall experience – your satisfaction – with text
        analytics.” It offered five categories, listed here with response counts:
               Overall experience/satisfaction (n=117, of whom 3 No experience/No opinion).
               Ability to solve business problems (n=114, 12 NE/NO).
               Solution/technology ease of use (n=112, 5 NE/NO).
               Solution/technology performance (n=114, 4 NE/NO).





                Availability of professional services/support (n=112, 13 NE/NO).
       Responses, which across categories are somewhat anomalous, are as shown:


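                Please rate your overall experience – your satisfaction – with
                                    text/content analytics

                                                       Completely                            Dis-        Very dis-
     Category                                          satisfied   Satisfied   Neutral    appointed   appointed
     Overall experience/satisfaction                      12%         58%        24%          4%          3%
     Ability to solve business problems                   17%         42%        31%          7%          3%
     Solution/technology ease of use                       9%         27%        38%         21%          4%
     Solution/technology performance                      12%         35%        36%         13%          4%
     Availability of professional services/support        11%         31%        36%         17%          4%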




       Overall, 70% of current-user respondents who had an opinion reported themselves
       Satisfied/Completely Satisfied, even though the breakout categories totaled only 59%, 36%,
       47%, and 42% Satisfied/Completely Satisfied. We can surmise that those who voiced
       “No experience/No opinion” for the breakout categories tended to have a favorable
       overall experience.
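       The Satisfied/Completely Satisfied totals quoted above are the sum of the top two
       rating levels in each category. The following is a minimal sketch of that calculation,
       assuming the percentages as read from the Q8 chart and assuming that the chart's bars
       appear in the same order as the five categories listed for Question 8; the dictionary
       keys and function name are illustrative:

```python
# Percentages as read from the Q8 stacked-bar chart. The mapping of bars to
# categories is an assumption based on the order in which the text lists them.
ratings = {
    "overall": {"completely": 12, "satisfied": 58, "neutral": 24,
                "disappointed": 4, "very_disappointed": 3},
    "solve_business_problems": {"completely": 17, "satisfied": 42, "neutral": 31,
                                "disappointed": 7, "very_disappointed": 3},
    "ease_of_use": {"completely": 9, "satisfied": 27, "neutral": 38,
                    "disappointed": 21, "very_disappointed": 4},
    "performance": {"completely": 12, "satisfied": 35, "neutral": 36,
                    "disappointed": 13, "very_disappointed": 4},
    "services_support": {"completely": 11, "satisfied": 31, "neutral": 36,
                         "disappointed": 17, "very_disappointed": 4},
}


def top_two_box(levels: dict) -> int:
    """Share (in percent) answering Satisfied or Completely Satisfied."""
    return levels["completely"] + levels["satisfied"]


top_two = {category: top_two_box(levels) for category, levels in ratings.items()}

for category, share in top_two.items():
    print(f"{category}: {share}% Satisfied/Completely Satisfied")
```

       The chart percentages are rounded, so the five levels in a category may sum to 99%
       or 101% rather than exactly 100%.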







       [Figure: radar chart, “Experience/satisfaction sentiment polarity,” plotting Positive,
       Neutral, and Negative response shares (scale 0% to 80%) for the five Q8 categories:
       Overall experience/satisfaction, Ability to solve business problems,
       Solution/technology ease of use, Solution/technology performance, and Availability of
       professional services/support.]




       Q9: Overall Experience
       Question 9 asked, “Please describe your overall experience – your satisfaction – with text
       analytics.” The following are 49 from among the 63 responses, categorized, lightly edited
       for spelling and grammar and with the names of three products masked:
                                                 Happy
        It works.

        Excellent.

        Absolutely essential.

        Very satisfied, most goals exceeded, big jump in effectiveness and customer
        satisfaction.

        Pretty happy given we are in a highly technical different to monitor/track niche.

        Saving a lot of time for our journalists.

        We have found having an application with the capabilities to clean and normalize
        the text and quantitative data, process it to a form to analyze, and run text mining
        and categorization on an ad hoc or production basis has greatly enhanced my
        team's capabilities and productivity.

        We found great value from using a Speech Analytics solution to retain customers
        and improve the overall customer experience through root-cause analysis.

        I have been working with text analytics for academic and scientific purposes and I
        am quite satisfied with results achieved.

        I work with nurse and social science researchers. They think that a chat with 20
        people is research. I tend to analyze hundreds or thousands of free-text comments.





        I use software to overcome the biases inherent in manual analysis.

                                            It Takes Work
        Very powerful tool but requires the organization's ability to take action on the
        insights.

        Valuable tool; my clients are content to underutilize it, so what is available more
        than meets our needs.

        Since we use open source, the ROI is basically how much time you put into the
        solution and how many problems it solves. We have been successful so far.

        Very Satisfied but extremely labor intensive

        We provide this as a tool to our clients in our application for publishing press
        releases. It works fine but could be better but that is up to us to implement it fully.

        Once you spend man hours to set up the tool, it is extremely consistent on doing
        what you tell it to do. I know improvements are coming but I'd like more AI from
        text analytics tools than what is currently offered.

        Do-It-Yourself is challenging but not impossible. Very cheap to operate.

        Fairly satisfied – problem is I am sole researcher and data/text clean-up takes too
        much time given other demands.

        I've been a user and vendor of text analytics (in fact, in my early <...> days, we
        helped coin the phrase “text analytics”). Vendors generally overpromise and have
        difficulty delivering. Both vendors and customers underestimate the amount of
        resources required to get it right. So, still hard to use for mainstream purposes.
                                  Reservations and complications
        Steep learning curve.

        I am currently satisfied, but I believe we (as analysts) are just beginning to fully
        unlock the full potential of text analytics.

        On one hand, I'm amazed and thrilled that this stuff exists at all. But on the other
        hand, I haven't seen anything that does just what I want it to do.

        It's opened up opportunities to analyze unstructured data but not at the same level
        as structured data.

        Works well at highest level of analysis (e.g. sentiment) but not as well in auto-
        coding for custom (i.e. project) studies.

        Tools are good, but lack transparency, ability to explain how conclusions are
        reached.

        There is still a lot of work required to optimize this technology since it can currently
        provide concepts but does not capture context and it’s a lot of slow painful work to
        get the software to recognize context in which something is mentioned and
        accuracy is still not a lot.

                                             Unmet needs
        Very promising technology but some difficulties to
          - Implement smoothly text mining component into existing information system.
          - Cope with various languages, formats, volumes, etc. of data.
          - Measure and demonstrate tangible results in terms of improved information
            extraction quality.
          - Assess ROI (reducing processing time / saving resources for core tasks e.g.
            analysis).
        Powerful but overly difficult, impenetrable - technology vs. solutions.

        An emerging and enabling technology in our business with broad applicability.
        Satisfied in our applications with accuracy and precision but hitherto disappointed
        with export capability to other applications.

        Still a volatile market for applications beyond VOC/sentiment analysis. Vendors are
        eager to please but sometimes overstate the capabilities. However, I still have
        limited experience in solving real business problems with these tools (I am a
        consultant).

        I think this field is in its infancy. Lots of issues with data quality. Sentiment
        analytics often flawed. Hard to scale or automate.

        The handful of companies and solutions I came across do not seem to marry or
        integrate structured and unstructured text easily... Algorithms are not quite
        available as a function or way to improve accuracy.

        I feel there is so much more work to be done both on the analysis side and also on
        the business implementation side. While I work heavily in this area, I won't be
        more satisfied until I see better end-to-end integration and until I see more
        effective and systematic use of insights.

        I do everything myself. The lack of good lexical resources and taxonomies is a real
        problem that drives up the cost (in manpower) of providing a solution. And the
        complexity of the infrastructure required vs. the apparent simplicity of the
        problem (in managers' minds) makes it very difficult to adjust expectations.

        We use <...> and we have to write our own routines to find the text and content
        that we are interested in. There are plenty of functions that help us with our goals
        but obviously there is still much that we need to do to higher recall and accuracy.

        <...> is the only tool which is both open source and professionally useful. However
        in spite of 20 years of development, it still has a very poor user interface as well as
        API interface which hinder productivity and acceptance at a beginner's level.
                                              Skepticism
        Jury is still out.

        It’s still evolving, accuracy of results something to watch for in iterations.






        Still learning.

        Very early days!

        Promising but still very difficult to see quick results. Everything seems to take ages
        and it’s been a painful learning curve.

        Hard to trust the automated results when you've been used to achieving 100% with
        manual human analysis.

        Still too new.

        Field as a whole is underperforming what is possible.

        Though the concept is very appealing, it is still in its native stages, and a lot more
        possibilities are left to be explored. IBM Watson is a good step ahead in that
        direction.

        Very poor, almost useless.
                                            Looking ahead
        On the whole, very satisfied with the range of solutions available and their ease of
        use. Very much looking forward to watching the technology progress – it's
        obviously not perfect yet.

        Unlike structured data, getting value out of text analytics tools require
        understanding of text elements – how to utilize occurrence of different parts of
        speech, how to interpret different types of sentences like requests, commands,
        opinionated sentences, etc. Domain knowledge and tunable and adaptable
        systems are a must for success. Non-availability of trained personnel to provide
        text mining services leads to dissatisfaction of users. Business end users do not like
        to use the tools themselves because of the complexity. The process or strategy for
        text mining needs to be established.

        We're pretty happy with text analytics and see it as a transformational technology.
        Most of text analytics' problems lie in how it is sold. It is both broad and deep and
        has a myriad of tools best suited for very different use cases, but customers think
        "text analytics is text analytics." Really, “text analytics” is a horrible term that
        needs to be broken up into component parts.


       Q10: Providers
       Question 10 asked, “Who is your provider? Enter one or more, separated by commas,
       most important provider first.” There were 77 response records, listing providers (sorted
       and without counts):
             Autonomy, Clarabridge, Colbenson, Content Analyst, Expert System,
             GATE, IBM, in-house, Lexalytics, LingPipe, Megaputer, MotiveQuest,
             open source, Open Text, R, Radian6, Rapid-I, Saplo, SAS, Smartlogic,
             Sysomos, TEMIS, Teradata, TextKernel, Thomson Reuters (including
             Calais, ClearForest), Verint, Zemanta
       Note that the survey asked, “Please respond only if you are a user, prospect, integrator, or
       consultant.”

       Q11: Provider Selection
       Question 11 asked, “How did you identify and choose your provider? (If more than one,
       limit response to your most important provider.)”

        Applicability, robust performance, open source.

        Research.

        Experience and luck.

        Very satisfied reference customers with similar applications, most flexible
        solutions, expertise of consultants, high quality of service, extreme agility, and
        extremely rapid idea-to-deployment cycles.

        They contacted us before launch of their first product.

        Product evaluation in context of business application.

        Based on business requirements in the framework of a European competitive
        tender procedure.

        Advised by a related Web development consultant.

        We spent about a year evaluating and classifying vendors that in part or whole
        would fill our needs as expressed in Q9. We decided on using an application with
        integrated quantitative and qualitative analytic capabilities as the best
        possibilities. We ended up doing POC's with SAS, SPSS and Megaputer, and ended
        up choosing the latter.

        We evaluated multiple providers based on (1) tool flexibility – can we customize?
        (2) accuracy (3) type of content it can tag (4) sentiment methodology (5) price.

        Main criteria are cost, multi-language capability, and integration with SAS.

        Competitive bids.

        Large existing analytics relationship: Tool was an add on.

        Conducted a thorough investigation of leading providers in the space.

        Quality and reviews.

        Personal recommendations.

        Constructed a needs analysis ranking system. Our needs included ease of
        integration, tools, ability to produce meaningful results at sub-document (short
        document) level, ease of (or no) training.

        Networking and academic partnerships.







        Proof of Concept – evaluated about a dozen or so vendors – have not selected a TM
        vendor as yet.

        Recommended by a trusted source.

        Based on recommendations <...> and our own search / lab testing, which brought
        us to <...>.

        Introduction from my manager.

        What my client uses.

        It was an obvious choice since there was no real alternative on the market (i.e.
        language is limiting the products).

        We compared a number of providers and decided to go for <...> that have a local
        presence and are experts on the Swedish language.

        Free, for research purposes.

        Trials based on performance.

        Price/performance tradeoff and applicability to targeted business problem.

        I work for the company. Use other languages (Perl) as necessary.

        Tested various services, rated results.

        I choose the vendor or tools based upon my client application needs.

        We do not have a primary provider ... we maintain a library of tools and use many
        of them in the same project.

        Trying all major ones.

        Reputation, personal contacts.

        Established open-source project.

        Market research, pricing, case studies and product evaluations.

        It was recommended to us.

        Working in-house.

        I worked for one of them and selected the other on their open source commitment.

        Proof of Concept.

        Support for Drupal.

        Cost, applicability to needs.






        I don't, that's up to my clients. But my advice to them is to begin with an
        understanding of the goals, and work backward to identify the provider.

        Company demo.

        Recommendation from experts, and tried and tested different ones.

        Management mandate, client.

        RFP to replace an existing legacy system.

        We already used <...> and had everything we needed to do Proof of Concept;
        waiting for business reason to acquire <...>.

        Advanced, scalable LSI technology.

        Work for them.

        Ability to mine audio, text, and customer surveys.


       Q13: Promoter?
       Question 13 is new with the 2011 survey; we did not ask it in 2009. It is a basic net-
       promoter type question, without the “net” part: “How likely are you to recommend your
       most important provider to others who are looking for a text/content analytics solution?”
       Of 87 responses, 49% were positive, 23% were neutral, and 28% were negative.


       [Pie chart: "How likely are you to recommend your most important provider?"
       Legend categories: Extremely likely to recommend against; Moderately likely to
       recommend against; Slightly likely to recommend against; Neither likely to recommend
       nor recommend against; Slightly likely to recommend; Moderately likely to recommend;
       Extremely likely to recommend. Slice values shown: 15%, 34%, 6%, 7%, 10%, 23%, 5%.]


       Promoters outweigh detractors by a net of 21 percentage points (49% positive less 28%
       negative).
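The net figure follows from simple subtraction of the shares reported for Q13. The short sketch below reproduces that arithmetic. The function name `net_recommendation` and the `shares` dictionary are illustrative assumptions, not part of the study, and the calculation is the plain positive-minus-negative difference stated in the text rather than a formal Net Promoter Score computed from a 0-10 scale.

```python
def net_recommendation(positive_pct: float, negative_pct: float) -> float:
    """Return positive share minus negative share, in percentage points."""
    return positive_pct - negative_pct


# Shares reported for Q13 (87 responses), in percent.
shares = {"positive": 49, "neutral": 23, "negative": 28}

# The three groups should account for all responses.
assert sum(shares.values()) == 100

net = net_recommendation(shares["positive"], shares["negative"])
print(f"Net recommendation: {net} percentage points")
```

The neutral share does not enter the difference. It matters only as a check that the three groups sum to 100%.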





Networking Case Study prepared by teacher.pptxHimangsuNath
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 

Recently uploaded (20)

Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
wepik-insightful-infographics-a-data-visualization-overview-20240401133220kwr...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processing
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 

Text/Content Analytics 2011: User Perspectives on Solutions and Providers

  • 1. Text/Content Analytics 2011: User Perspectives on Solutions and Providers Seth Grimes An Alta Plana research study Sponsored by Published September 9, 2011 under the Creative Commons Attribution 3.0 License.
  • 2. Text/Content Analytics 2011: User Perspectives

Table of Contents
Executive Summary
     Market Size and Growth
     Growth Drivers
     The 2011 Market
     The Study
     Key Study Findings
     About the Study and the Report
Text and Content Analytics Basics
     From Patterns…
     … To Structure
     Beyond Text
     Metadata
     A Focus on Applications
Applications and Markets
     Application modes
     Business Domains
     Business Functions
     Technology Domains
     Solution Providers
Demand-Side Perspectives
     Study Context
     About the Survey
     Market Size and the Larger BI Market
     The Data Mining Community
Demand-Side Study 2011: Findings
     Q1: Length of Experience
     Q2: Application Areas
     Q3: Information Sources
     Q4: Return on Investment
     Q5: Mindshare
     Q6: Spending
     Q8: Satisfaction
     Q9: Overall Experience
     Q10: Providers
     Q11: Provider Selection
     Q13: Promoter?
     Q14: Information Types
     Q15: Important Properties and Capabilities
     Q16: Languages
     Q17: BI Software Use
     Q18: Guidance
     Q19: Comments
     Additional Analysis
     Interpretive Limitations and Judgments
About the Study
Solution Profile: AlchemyAPI
Solution Profile: Attensity
Solution Profile: Basis Technology
Solution Profile: Language Computer Corp.
Solution Profile: Lexalytics
Solution Profile: Medallia
Solution Profile: SAS
Solution Profile: Sybase
Solution Profile: Verint Systems Inc.
  • 3. Executive Summary

Text and content analytics have become a source of competitive advantage, enabling businesses, government agencies, and researchers to extract unprecedented value from “unstructured” data. Uptake is strong – software, solutions, and services are delivering significant business value to users in a spectrum of industries – yet the potential of the market remains unreached. These points and more are brought out in Alta Plana’s market study, “Text/Content Analytics 2011: User Perspectives on Solutions and Providers.”

Market Size and Growth
Tools and solutions now cover the gamut of business, research, and governmental needs. User adoption continues to grow at a very rapid pace, an estimated 25% in 2010, creating an $835 million market for software tools, business solutions, and vendor-supplied support and services. These tools and solutions generate business value several times that figure, extrapolating from revenue generated by applications and solutions (for instance, social-media analysis, e-discovery, and search), information products created by mining content, professional services, and research.

The addressable market for text/content analytics is much larger. The technologies are a subset of a larger business intelligence, analytics, and performance management software market, which is dominated by solutions that analyze numerical data that originates in enterprise operational systems. Gartner estimated that larger market at $10.5 billion globally in 2010. Yet, given now-broad awareness of the business value that resides in “unstructured” social, online, and enterprise sources, text/content-analytics’ share of the much larger market will surely grow steeply in coming years. Overall, expect annual text/content-analytics growth averaging up to 25% for the next several years.
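As a rough illustration of the arithmetic behind these figures, the sketch below compares the $835 million estimate with Gartner's $10.5 billion figure and compounds the 25% upper-bound growth rate. The three-year horizon and the function name project_market are assumptions chosen for illustration. The outputs are simple arithmetic consequences of the numbers quoted above, not forecasts published in the study.

```python
# Figures quoted in the report (millions of US dollars, 2010).
MARKET_2010_MUSD = 835.0         # text/content analytics market estimate
BI_MARKET_2010_MUSD = 10_500.0   # Gartner's estimate for the larger BI/analytics market
MAX_ANNUAL_GROWTH = 0.25         # "up to 25%" treated here as a constant upper bound


def project_market(base_musd, growth, years):
    """Return year-by-year market sizes under constant compound growth."""
    return [base_musd * (1 + growth) ** n for n in range(1, years + 1)]


# Ratio of the text/content analytics estimate to the larger market estimate.
share = MARKET_2010_MUSD / BI_MARKET_2010_MUSD

# Hypothetical three-year horizon, chosen only for illustration.
projection = project_market(MARKET_2010_MUSD, MAX_ANNUAL_GROWTH, years=3)

print(f"2010 size relative to the larger market: {share:.1%}")
for offset, size in enumerate(projection, start=1):
    print(f"{2010 + offset}: ${size:,.0f} million (upper bound)")
```

Because the report says growth will average "up to" 25%, these values are best read as a ceiling under that assumption rather than as an expected trajectory.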
Growth Drivers
A number of factors contribute to sustained growth, foremost the growth of social platforms, which have become essential life tools for individuals and an important business marketing, communication, research, and commerce channel.

Social
Keeping up with Social is a must for every consumer-facing organization, and automated monitoring, measurement, and engagement is the only way to deal with Social’s variety, volume, and velocity. Leading solutions rely on natural-language processing, provided by text/content analytics, to identify and extract facts and sentiment. Expect even lower-end tools to embrace NLP by 2013.

Publishing, advertising, and information services
Second, text/content analytics is central to competitive online publishing and advertising and to effective information access (essentially, next-generation search). These are two sides of a single coin. As applied by content producers and publishers, technologies discover and associate appropriate descriptive and semantic labels with content. The aims are to optimize search findability, to allow content to be stored and retrieved at a fine-grained level (documents as databases), and to enhance the content consumer’s experience interacting with content. As applied by search, content aggregation, online advertising, and information-service providers, the technology fuels situationally appropriate results that respond to the information/service seeker’s context and intent.
  • 4. Question-answering and information access
Question-answering systems such as IBM Watson and Wolfram Alpha are examples of next-generation, analytics-enabled information-access engines, which will play a key role in online commerce, customer support, health-service delivery, and other applications starting by early 2013. Similarly, Semantic Web information resources should finally enter the mainstream by 2014. They will very frequently rely on analytics to semanticize and structure content and support on-the-fly information integration.

Rich media
Last, content analytics makes sense of rich media. The technology finds and exploits patterns – what’s in a given piece of content and how the content of content changes over time – in speech and sound, images, and video. Today there are important content-analytics applications for contact centers, security, general information access, and even consumer electronics: Witness face detection and tracking in consumer-grade cameras and camcorders. Arguably, we could include analyses of social and enterprise networks, mined from e-mail, messaging, online, and social content, under the content-analytics umbrella.

The 2011 Market
As in prior years, no single solution provider dominates the market. Players range from the largest enterprise software vendors to a stream of new entrants, both commercializing research technologies and bringing solutions to new markets. In between, established enterprise content management (ECM), BI and analytics, search, software tools, and business-solution providers – the sponsors of this study among them – continue to innovate and deliver business value.

The Study
Alta Plana’s 2011 text/content analytics market study combines a survey-based, quantitative and qualitative examination of usage, perceptions, and plans with observations derived from numerous conversations with solution providers and users. It seeks to answer the question, “What do current and prospective text/content-analytics users really think of the technology, solutions, and solution providers?” Responses will help providers craft products and services that better serve users. Findings will guide users seeking to maximize benefit for their own organizations.

Alta Plana received 224 valid survey responses between June 6 and July 9, 2011. This document reports findings and, when appropriate, contrasts them with comparable numbers from Alta Plana’s spring-2009 text-analytics market study, “Text Analytics 2009: User Perspectives on Solutions and Providers” (http://altaplana.com/TA2009).

Key Study Findings
The following are key 2011 study findings:
    • The big news is not news at all: Social is by far the most popular source fueling text/content analytics initiatives. Four of the top 5 information categories are social/online (as opposed to in-enterprise) sources:
        o blogs and other social media (62%)
        o news articles (41%)
        o on-line forums (35%)
        o reviews (30%)
  • 5. as well as direct customer feedback in the form of:
        o customer/market surveys (35%)
        o e-mail and correspondence (29%)
      for an average of 4.5 sources per respondent.
    • All three top capabilities that users look for in a solution, each garnering over 50% response, relate to getting the most information out of sources:
        o Broad information extraction capabilities (63%)
        o Ability to use specialized dictionaries, taxonomies, ontologies, or extraction rules (57%)
        o Deep sentiment/emotion/opinion extraction (57%)
      Low cost dropped from 51% of 2009 responses to 38% in 2011.
    • Top business applications of text/content analytics for respondents are the following:
        o Brand / product / reputation management (39% of respondents)
        o Voice of the Customer / Customer Experience Management (39%)
        o Search, Information Access, or Question Answering (36%)
        o Competitive intelligence (33%)
    • Seventy percent of users are Satisfied or Completely Satisfied with text/content analytics and 24% are Neutral, with only 7% Disappointed or Very Disappointed. Dissatisfaction is greatest, at 25%, with ease of use, with only 36% satisfied. Only 42% are satisfied with availability of professional services/support.
    • Only 49% of users are likely to recommend their most important provider. 28% would recommend against their most important provider.

About the Study and the Report
Seth Grimes, an industry analyst and consultant who is a recognized authority on the application of text analytics, designed and conducted the study “Text/Content Analytics 2011: User Perspectives on Solutions and Providers” and wrote this report.

The author is grateful for the support of the nine study sponsors, Verint, Sybase, SAS, Medallia, Lexalytics, Language Computer Corporation, Basis Technology, Attensity, and AlchemyAPI. Their sponsorships allowed him to conduct an editorially independent study that should promote understanding of the text/content analytics market and of user-indicated implementation and operations best practices.

The solution profiles that follow the report’s editorial matter were provided by the sponsors and included with only minor editing to regularize their layout. Otherwise, the author is solely responsible for the editorial content of this report, which was not reviewed by the sponsors prior to publication.
  • 6. Text and Content Analytics Basics

The term text analytics describes software and transformational processes that uncover business value in “unstructured” text via the application of statistical, linguistic, machine learning, and data analysis and visualization techniques. The aim is to improve automated text processing, whether for search, classification, data and opinion extraction, business intelligence, or other purposes.

Rough synonyms include text mining, text ETL, and semantic analysis. Terminology choices are typically rooted in history and competitive positioning. Text mining is an extension of data mining, and text ETL an extension of the BI world’s extract-transform-load concept. Semantic analysis seems most often used by Semantic Web aficionados, who sometimes use the broader term Semantic Web technologies, which also covers protocols such as RDF, triple stores, query systems, and the like. These text technologies all perform some form of natural language processing (NLP).

Content analytics can and should be seen as an extension of capabilities to also cover images, audio and speech, video, and composites, the gamut of information types not generated or held in data fields. (Some organizations use the content analytics label for text analytics on online, social, and enterprise content, typically, published information. These organizations most often have a strong focus on enterprise content management (ECM) systems.)

From Patterns…
Text, images, speech and other audio, and video are all directly understandable by humans (although not universally: Any given human language – English, Japanese, or Swahili – is spoken by a minority of people, and not everyone recognizes a Beethoven symphony or Nelson Mandela in a photo). Understanding relies on three capabilities:
    1) Ability to recognize small- and large-scale patterns.
    2) Ability to grasp context and, from context, to infer meaning.
    3) Ability to create and apply models.

Descriptive statistics provides an NLP starting point: The most frequently used words and terms give an indication of the topics a message or document is about. We can create categories and classify text (a form of modeling) based on notions of statistical similarity.

Next steps take advantage of the linguistic structure of text, detectable by machines as patterns. We have word form (“morphology”) and arrangement (grammar and syntax) as well as higher-level narrative and discourse. Usage may be correct (as judged by editors, grammarians, and linguists) or not, whether the language is spoken, formally written, or texted or tweeted: The most robust technologies deal with text in the wild. We apply assets such as lexicons of “named entities”; part-of-speech resolution that can help identify subject, object, relationship, and attributes; and “word nets” that associate words to help in disambiguation, that is, determination of the contextual sense of terms that may have different meanings in different contexts.

Yet, in the words of artificial-intelligence pioneer Edward A. Feigenbaum, “Reading from text in general is a hard problem, because it involves all of common sense knowledge. But reading from text in structured domains, I don’t think is as hard.” So some techniques (also) apply knowledge representations such as ontologies to the analysis task. All techniques, however, aim to generate machine-processable structure.
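The descriptive-statistics starting point described above (frequent terms as topic indicators, classification by statistical similarity) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a method taken from the study: the stopword list, the sample texts, the category labels, and the function names are all hypothetical, and cosine similarity over raw term counts stands in for whatever similarity measure a real product might use.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger ones).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "was", "with", "for", "it"}


def term_counts(text):
    """Lowercase, tokenize on letters/apostrophes, and count non-stopword terms."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, labeled_examples):
    """Return the label whose example text is most similar to the input."""
    counts = term_counts(text)
    return max(
        labeled_examples,
        key=lambda label: cosine_similarity(counts, term_counts(labeled_examples[label])),
    )


# Hypothetical category examples and message.
examples = {
    "billing": "The invoice charged my card twice and the refund is late",
    "support": "The agent was helpful and the support call was quick",
}
message = "I was charged twice on my invoice and want a refund"

print(term_counts(message).most_common(3))
print(classify(message, examples))
```

A sketch like this covers only the first of the three capabilities listed above, pattern recognition at the word level. It has no notion of context or linguistic structure, which is where the lexicons, part-of-speech resolution, and word nets mentioned in the text come in.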
  • 7. … To Structure
NLP outputs, as part of a text-analytics system, are typically expressed in the form of document annotations, that is, in-line or external tags that identify and describe features of interest. Outputs may be mapped into machine-manageable data structures, whether as relational database records or in XML, JSON, RDF, or another format.

Text-extracted data represented in the Semantic Web’s Resource Description Framework (RDF) may form part of a Linked Data system. Text-derived information stored in a relational database may become part of a business intelligence system that jointly analyzes, for instance, DBMS-captured customer transactions and free-text responses to customer-satisfaction surveys. And text-extracted features such as entities, topics, dates, and measurement units may form the basis of advanced semantic search systems.

Beyond Text
Beyond-text technologies for information extraction from images, audio, video, and composite media exist but do not match NLP’s sense-making capabilities. Likely most developed is speech-analysis technology, which supports indexing and search using phonemes and is capable of detecting emotion in speech via analysis of indicators such as pace, volume, and intonation, with contact-center and other applications that include intelligence. Intelligence, along with consumer and social search, motivates work on image analysis, as do marketing and competitive-intelligence related studies of online and social brand mentions and use. Video analytics extends both speech and image analysis, with an added temporal aspect, for security applications and also potential business uses such as study of customer in-store behavior.

For beyond-text media, as for text, metadata is of critical importance.

Metadata
Metadata describes data properties that may include the provenance, structure, content, and use of data points, datasets, documents, and document collections. Content-linked metadata typically includes author, production and modification dates, title, topic(s), keywords, format, language, encoding (e.g., character set), rights, and so on. The metadata label extends to specialized annotations such as part-of-speech and data type.

Metadata may be created as part of content production or publication (for instance, the save date captured by a word-processor, a geotag associated with a social update, camera information stored in an image file). It may be appended (for instance via social tagging), or extracted from content via text/content analysis. Whether stored internally within a data object (for instance via RDFa, FOAF, or other microformats embedded in a Web page) or managed externally, in a database or search index, metadata is fuel for a range of applications.

A Focus on Applications
We will not devote further space in this report to discussion of text- and content-analysis technology. If you do want to learn more about text-analytics history and technology, do continue with the technology sections of Alta Plana’s 2009 study report, “Text Analytics 2009: User Perspectives on Solutions and Providers,” available online at http://altaplana.com/TextAnalyticsPerspectives2009.pdf.

As a bridge to survey-derived reporting of user perceptions of the text and content analytics market, solutions, and providers, we will look next at applications.
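To make the annotation and metadata ideas from the “… To Structure” and “Metadata” sections above more concrete, here is a minimal sketch of external (stand-off) annotations serialized as JSON alongside a few content-linked metadata fields. Everything in it is an assumption for illustration: the Annotation fields, the sample sentence, the label names, and the record layout are hypothetical, and the simple lexicon lookup merely stands in for a real NLP pipeline.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Annotation:
    """A stand-off tag: character offsets into the document plus a label."""
    start: int
    end: int
    label: str
    text: str


def annotate(document, lexicon):
    """Tag every occurrence of each lexicon term (a stand-in for real NLP)."""
    annotations = []
    for term, label in lexicon.items():
        position = document.find(term)
        while position != -1:
            annotations.append(Annotation(position, position + len(term), label, term))
            position = document.find(term, position + len(term))
    return sorted(annotations, key=lambda a: a.start)


# Hypothetical document and named-entity lexicon.
document = "Seth Grimes wrote the report for Alta Plana in 2011."
lexicon = {"Seth Grimes": "PERSON", "Alta Plana": "ORGANIZATION", "2011": "DATE"}

record = {
    # Content-linked metadata of the kind listed in the Metadata section.
    "metadata": {"language": "en", "encoding": "utf-8", "format": "text/plain"},
    # Annotations kept external to the text, addressed by character offsets.
    "annotations": [asdict(a) for a in annotate(document, lexicon)],
}

print(json.dumps(record, indent=2))
```

The same record could equally be written to relational tables or expressed as RDF triples. JSON is used here only because it needs no third-party library.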
Applications and Markets

Business users naturally focus on business benefits, whether of analytics or of any other technology or investment. Who are those users? Text and content analytics solutions have a place a) in any business domain, b) for any business function, and c) within any technology stack that would benefit from automated text/content handling, that is, wherever text/content volume, velocity, and variety, and business urgency, are sufficient to justify costs.

Consider a very telling quotation, however: Philip Russom of the Data Warehousing Institute wrote in a 2007 report, “BI Search and Text Analytics: New Additions to the BI Technology Stack,”[2] “Organizations embracing text analytics all report having an epiphany moment when they suddenly knew more than before.” In the analytics world, we see now that it is not enough to know more. You need to understand how to use knowledge gained, the processes and outcomes necessary to turn insights into ROI.

Text and content analytics elements – information sources, insights sought, processes, and ROI measures – will vary by industry and application. In this report section, by way of lead-in to survey findings (applications, information sources, and ROI measures are the subject of survey questions 3, 4, and 5), we look at text/content analytics adaptation for applications in several industries and for a variety of business functions.

Application modes

Applications are diverse but may be classified in several (overlapping) groups. Our categorization is an update of 2009’s, with social and online additions in particular:

• Media, knowledgebase, and publishing systems – the author includes search engines here – use text and content analytics to generate metadata and to enrich and index metadata and content in order to support content distribution and retrieval. Semantic Web applications would fit in this category, as would emerging information-access engines.
• Content management systems – and, again, related search tools – use text analytics to enhance the findability of content for business processes that include compliance, e-discovery, and claims processing.
• Line-of-business and supporting systems for functions such as compliance and risk, customer experience management (CEM), customer support and service, marketing and market research, human resources and recruiting… and newer tasks that include social monitoring, measurement, and engagement.
• Investigative and research systems for functions such as fraud, intelligence and law enforcement, competitive intelligence, and science.

Where are these applications used?

[2] http://www.teradata.com/assets/0/206/308/96d9065a-0240-44f1-b93c-17e08ae6eacc.pdf

Business Domains

Consider a sampling of industry domains where text and content analytics are frequently applied:

• In intelligence and counter-terrorism, and in law enforcement, there is broad content variety – languages, format (text, audio, images, and video), sources (news, field reports, communications intercepts, government records, social
postings) – and, at times, great urgency.
• In life sciences, for instance for pharmaceutical drug discovery, source materials have been more uniform (scientific literature, clinical reports) and there is no need for real-time response, yet information volumes are huge and complex and the potential payoff – years and millions of dollars shaved off lead-generation and clinical trials processes – justifies very significant investments in text mining.
• For financial services and insurance, effective credit, risk, fraud, and legal and regulatory-compliance decision-making involves creation of predictive models via analysis of large volumes of transactional records and often incorporates information mined from text sources such as financial and news reports, e-mail and corporate correspondence, and insurance and warranty claims. Automated methods are essential.
• Market researchers rely on text analytics to hear and understand market voices. Focus groups are (on their way) out: They are costly, slow, and often unreliable. Surveys still have great value – beyond soliciting opinions, they can serve as an engagement tool – but neither they nor focus groups help researchers hear unprompted views, the attitudes that consumers express to their peers but not in more formal research settings. Why text analytics? Social is hot, yet human analysis, whether of surveys or of social postings, can be inconsistent and doesn’t scale. Add in text analytics and you have next-generation market research.
• As content delivery and consumption shift to digital, search and information-dissemination tools that exploit metadata (publisher-produced, analytically generated, or socially tagged) are essential survival tools for media and publishing organizations. Content analytics creates better targeted, richer content and a much friendlier and more powerful experience for content consumers.
• Online and social have fomented an advertising revolution. Targeting is the word, whether based on behaviors (modeled via tracking and clickstream analysis) or on analytically computed matching. Matches may draw from user profiles, context (geography, accessing application, device or machine being used), inferred intent (for instance from search terms), and the semantic signatures of the content where ads are to be delivered.
• Text analytics provides essential capabilities in support of legal-domain e-discovery mandates. Organizations must “produce” materials relevant to lawsuits, a task that would often be impossible without automated text processing, given huge volumes of electronically stored information generated in the course of business. Intellectual property is another legal-domain application. The task is to identify names, terminology, properties, and functions salient to an IP search that seeks to identify, for instance, prior art and possible patent infringement.

Business Functions

Many business tasks are independent of industry. Every organization of any significant size has in-house customer support, marketing, product development, and similar functions (even while definitions of customer, marketing, and product do still, of course, vary by industry). Let’s examine the role text and content analytics play for the following:

• Customer experience management (CEM) is a signal text/content analytics success story. The aim is to transform customer relationship management (CRM), which captures transactions and interactions, into a set of tools and practices that cover the engagement span from customer acquisition to customer service and
support, first and foremost by listening and responding to the voice of the customer across channels. In plain(er) English, CEM marries text- and speech-sourced information – from e-mail, online forums, surveys, contact-center conversations, and other touchpoints… and also from employee input – with transactional and profile information. The hope is to improve customer satisfaction and operate more efficiently and profitably. Simplistic, reductive indicators such as the Net Promoter Score can only point at issues and challenges. They can neither explain them nor suggest actions or remedies – insights that are accessible (at enterprise scale) only via text and content analytics.
• Marketers translate market-research and competitive-intelligence findings into marketing campaigns and advertising and, in cooperation with product developers, into higher quality, more satisfying products and services. It’s all about listening. Steve Rappaport, in his book Listen First!, says we should “Change the research paradigm. Social media listening research should bring about an era of real-time data that anticipates change and can be used to visualize and create a rewarding business future,” as well as “rethink marketing, advertising, and media.” His prescriptions about listening apply across channels and touchpoints, as they do for CEM, with the difference here, for research-related functions, being that we are looking at an aggregate rather than an individualized picture, seeking to hear the voice of the market, again aided by text and content analytics. Our aim is to deliver targeted, compelling advertising via more effective marketing and, of course, superior products and services that better meet customer needs.
• Competitive intelligence, in particular, involves mining customer voices, at both individual and aggregate levels, and also business information, for instance about sales, personnel, alliances, and market conditions that indicate opportunities and threats. The ability to extract domain- and sector-focused information from online and social sources, and to integrate information from disparate sources in order to derive coherent signals, is essential and is delivered by analytically rooted technologies.
• Business intelligence (BI) was first defined, in the late 1950s, in terms of extraction and reuse of knowledge drawn from textual sources.[3] BI took off in a different direction, however, starting in the late 1960s, centering on analysis of numerical data captured in computerized corporate operational and transactional systems. Back to the (1950s) Future: Number crunchers of all stripes recognize the business value of information in text sources. They are seeking, with the help of both major and niche BI and data warehousing vendors, to bring text-sourced information into enterprise BI initiatives. Call this integrated analytics, also incorporating geospatial and machine-generated Big Data to bring businesses a step closer to the sought-after (although mythical) 360°-view of the customer (and the market and one’s own business).

[3] Seth Grimes, “BI at 50 Turns Back to the Future,” InformationWeek, November 21, 2008: http://www.informationweek.com/news/software/bi/211900005

Technology Domains

Last, for context, let’s briefly consider technology domains where text and content analytics come into play, namely semantics and the Semantic Web, and then look at emerging text analytics applications.

Semantic Computing

First, to redefine: text/content analytics involves the acquisition, processing, analysis, and
presentation of enterprise, online, and social information derived from text and rich-media sources. The technology is one route to semantics, to generating machine-usable identification of information objects attached to databases, tables, fields, and rows; to corpora, documents, and document content; and to media files, e-mail, and text messages.

Text/content analytics provides a descriptive route to semantics, making sense of information in-the-wild, as generated by humans (and machines) online, on social platforms, and in everyday business and personal communications, whether written, spoken, or captured in rich media. The alternative route to semantics is prescriptive, generated or captured in the course of content generation, whether via database export or a plug-in to an authoring application.

The semantics market includes technologies for the creation, management, and use of artifacts such as taxonomies, ontologies, thesauruses, gazetteers, semantic networks, controlled vocabularies, and metadata. These artifacts may be generated manually by subject-matter experts. They may be generated automatically by text analytics. And in many situations, a hybrid system involving manual curation of automatically generated artifacts may be in order.

Semantics applications include digital content management, publishing, research, and librarianship across a broad set of industrial and government applications. The semantics market includes semantic search, whether open-domain, vertical (applied to a particular information domain), or horizontal (applied in a particular business function). It also includes classification and information integration.

Classification, Search, and Integration

Semantic computing finds its primary application in classification, search, and integration.
Classification determines what a data item or object represents, including how it may be used, in relation to other data items and objects in a data space. This is, admittedly, an abstract and not particularly practical definition. Information integration and search are where semantics finds its most compelling applications.

Semantic search is, in essence, “search made smarter, search that seeks to boost accuracy by taming ambiguity via an understanding of context.”[4] Several approaches fit under the semantic search umbrella. They include related searches, search-results enrichment, concept searches, faceted search, and more. The common thread is better matching searcher intent (inferred from search context including past searches and the searcher’s profile) to searched-for information content. Semantic search is behind many emerging search-based applications, fueled by text and content analytics, for applications such as e-discovery, faceted navigation for online commerce, and search-driven business intelligence.

And it is captured semantics, in the form of data identifiers and descriptions, that enables dynamic, adaptive information integration, where join paths are discovered based on business and application needs rather than hard-wired, as they were in computing generations until recently.

[4] Seth Grimes, “Breakthrough Analysis: Two + Nine Types of Semantic Search,” InformationWeek, January 21, 2010: http://www.informationweek.com/news/software/bi/222400100

The Semantic Web

The Semantic Web is, at its root, an information-integration and sharing application, a set of standards and protocols designed to facilitate creation and use of a “Web of data.” Eventually, the Semantic Web market will include tools and services that execute knowledge-reliant business transactions over distributed, semantically infused data spaces. We are years from that market.

The bulk of Semantic Web-focused expenditures are for government-funded research
projects at universities and similar institutions. Outside research contexts, business implementations do not extend significantly beyond a) the use of microformats and RDFa (Resource Description Framework in Attributes) to allow Web-published structured data to be indexed by search engines to facilitate information access and b) the use of RDF triples as a convenient format for structuring facts for storage in DBMSes supporting graph-database schemas to facilitate integration and query of data from disparate sources. At a certain point, however, perhaps in 2-4 years, the Semantic Web will reach a tipping point where its business value, and the revenues generated by technology and solutions sales, licensing, and support, will explode.

Value Today

At this time, text/content analytics delivers business value that is greater by far than the value delivered by related semantic and Semantic Web technologies. This is because the vast majority of subject information – text, images, audio, and video (a.k.a. content) – is in “unstructured” form, just a string of bytes (and terms, in the case of text) so far as software systems – Web browsers and office productivity tools, content management systems, search engines – are concerned.

To make content tractable for business ends, for operational or analytical purposes or in order to monetize content as a product, one must first create structure. To maximize content usability, for most social and for many enterprise sources, generated structure will take into account semantic information extracted from source materials. That is, structure shouldn’t be arbitrary, a matter of sticking information into a set of round pigeonholes for square-peg content. This process of the discovery, extraction, and use of semantic information in content is the domain of text/content analytics solutions.
Solution Providers

The aggregate characteristics of the text and content analytics solution-provider spectrum are little changed since 2009, although there has been significant turnover in players. We still have, as reported in 2009, “a significant cadre of young pure-play software vendors, software giants that have built or acquired text technologies, robust open-source projects, and a constant stream of start-ups, many of which focus on market niches or specialized capabilities such as sentiment analysis.”

The big change is in delivery mode. The market now favors as-a-service analytics, whether in the form of online applications, cloud provisioned, or provided via Web application programming interfaces (APIs). This shift makes sense.

• The most in-demand new information sources are online, social, and on-cloud.
• Use of as-a-service, cloud, and via-API applications means low up-front investment, faster time to use, and pay-as-you-go pricing without IT involvement.
• Certain providers offer as-a-service access to both historical and current data at attractive costs given the buy-once, sell-many-times economies they enjoy.
• Modern applications are designed to draw data via APIs, facilitating application inclusion of plug-in text and content analytics capabilities.

There is every expectation that the solution-provider market will continue to evolve to keep pace with user needs and broad-market business and technical trends.
Demand-Side Perspectives

Alta Plana designed a 2011 survey, “Text/Content Analytics demand-side perspectives: users, prospects, and the market,” to collect raw material for an exploration of key text-analytics market-shaping questions:

• What do customers, prospects, and users think of the technology, solutions, and vendors?
• What works, and what needs work?
• How can solution providers better serve the market?
• Will your companies expand their use of text analytics in the coming year? Will spending on text/content analytics grow, decrease, or remain the same?

It is clear that current and prospective text/content-analytics users wish to learn how others are using the technology, and solution providers of course need demand-side data to improve their products, services, and market positioning, to boost sales and better satisfy customers. The Alta Plana study therefore has two goals:

• To raise market awareness and educate current and prospective users.
• To collect information of value to solution providers, both study sponsors and non-sponsors.

Survey findings, as presented and analyzed in this study report, provide a form of measure of the state of the market, a form of benchmark. They are designed to be of use to everyone who is interested in the commercial text/content-analytics market.

Study Context

The author previously explored market questions in a number of papers and articles. These included white papers created for the Text Analytics Summit in 2005, “The Developing Text Mining Market,”[5] and 2007, “What's Next for Text.”[6]

A systematic look at the demand side provides a good complement to provider-side views and to vendor- and analyst-published case studies, including the author’s own.
This understanding motivated the 2009 study, “Text Analytics 2009: User Perspectives on Solutions and Providers,” available for free download.[7] That research was preceded by Alta Plana’s 2008 study report, “Voice of the Customer: Text Analytics for the Responsive Enterprise,”[8] published by BeyeNETWORK.com, a first systematic survey of demand-side perspectives, albeit focused on a particular set of business problems. VoC analysis is frequently applied to enhance customer support and satisfaction initiatives, in support of marketing, product and service quality, brand and reputation management, and other enterprise feedback initiatives.

About the Survey

There were 224 responses to the 2011 survey, which ran from June 6 to July 9, 2011. (Contrast with 116 responses to the 2009 survey, which ran from April 13 to May 10, 2009.)

[5] http://altaplana.com/TheDevelopingTextMiningMarket.pdf
[6] http://altaplana.com/WhatsNextForText.pdf
[7] http://altaplana.com/TA2009
[8] http://altaplana.com/BIN-VOCTextAnalyticsReport.pdf
Survey invitations

The author solicited responses via

• E-mail to the TextAnalytics, SentimentAI, Corpora, Lotico, BioNLP, Information-Knowledge-Content-Management, and ContentStrategy lists and the author’s personal list.
• Invitations published in electronic newsletters: InformationWeek, BeyeNETWORK, CMSWire, KDnuggets, AnalyticBridge, and Text Analytics Summit.
• Notices posted to LinkedIn forums and Facebook groups and on Twitter.
• Messages sent by sponsors to their communities.

Survey introduction

The survey started with a definition and brief description as follows:

Text Analytics / Content Analytics is the use of computer software or services to automate
• annotation and information extraction from text – entities, concepts, topics, facts, and attitudes,
• analysis of annotated/extracted information,
• document processing – retrieval, categorization, and classification, and
• derivation of business insight from textual sources.

This is a survey of demand-side perceptions of text technologies, solutions, and providers. Please respond only if you are a user, prospect, integrator, or consultant. There are 21 questions. The survey should take you 5-10 minutes to complete.

For this survey, text mining, text data mining, content analytics, and text analytics are all synonymous.

I'll be preparing a free report with my findings. Thanks for participating!

Seth Grimes (grimes@altaplana.com, +1 301-270-0795)

The introduction ended with the text:

Privacy statement: This survey records your IP address, which we will use only in an effort to detect bogus responses. It is your choice whether to provide your name, company, and contact information. That information will not be shared with sponsors without your permission, and if shared with sponsors, it will not be linked to your survey responses.
Survey response

There is little question that the survey results overweight current text-analytics users among the broad set of potential business, government, and academic users. Current users make up 73% of respondents who answered Q1, “How long have you been using Text Analytics?” (n=224), and 78% of respondents who replied to Q7, “Are you currently using text/content analytics?” (n=206). (The difference in percentage is likely due to a higher rate of survey abandonment among non-users. The figures contrast with 63% and 61% in the 2009 survey.) So call this a Pac Man question, one whose response indicates very significant survey selection bias:

[Pie chart: Are you currently using text/content analytics? Yes 78.2%, No 21.8% (n=206)]

Market Size and the Larger BI Market

We can infer overweighting by comparing market-size figures. The author estimates an $835 million 2010 global market for text/content-analytics software and vendor-supplied support and services. As the author described in the May 12, 2011 InformationWeek article “Text-Analytics Demand Approaches $1 Billion,”[9] “My $835 million market-size estimate covers software licenses, service subscriptions, and vendor-provided technical support and professional services. Despite strong growth, it remains a small fraction of Gartner's $10.5 billion 2010 valuation of the broader BI, analytics, and performance-management software market.”[10]

By contrast, the 2009 text-analytics market report cited the author’s figure of $350 million for the global, 2008 text analytics market. (That figure did not account for search-based applications, which were included in the 2010 market-size estimate.)
The 2009 report also cited a 2008 BI-market estimate from research firm IDC: “The business intelligence tools software market grew 6.4% in 2008 to reach $7.5 billion.”[11]

[9] http://www.informationweek.com/news/software/bi/229500096
[10] http://www.gartner.com/it/page.jsp?id=1642714
[11] http://www.idc.com/getdoc.jsp?containerId=217443
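The market-size comparison above comes down to simple arithmetic, sketched here in Python. The revenue figures are the ones quoted in the text; the dictionary layout and the function name `share_of_bi` are illustrative choices only. The two years are not strictly comparable, since the 2008 text-analytics figure excludes search-based applications and the BI estimates come from different firms with different scopes, so the output is a rough indication rather than a measured market share.

```python
# Revenue figures quoted in the text, in millions of US dollars.
# The structure and names below are illustrative, not from the report.
MARKETS = {
    2008: {"text_analytics": 350, "bi": 7_500},    # BI figure: IDC, BI tools
    2010: {"text_analytics": 835, "bi": 10_500},   # BI figure: Gartner, BI/analytics/PM
}


def share_of_bi(year: int) -> float:
    """Return text-analytics revenue as a percentage of that year's BI estimate."""
    market = MARKETS[year]
    return 100.0 * market["text_analytics"] / market["bi"]


# Round to one decimal place for reporting.
shares = {year: round(share_of_bi(year), 1) for year in MARKETS}

for year in sorted(shares):
    print(f"{year}: text analytics is roughly {shares[year]}% of the BI estimate")
```

Either way, text analytics accounts for well under a tenth of BI-market revenue, while 78% of survey respondents report being current users. That gap is the basis for the inference that the sample overweights users.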
The Data Mining Community

Another contrasting data point is that 65% of respondents (n=121) to a July 2011 KDnuggets poll[12] report using text analytics on projects in the preceding year. Results were tallied nine days into the poll, before it was closed, so final numbers may differ from those reported here. The figure in a similar, March 2009 poll was 55% currently using text analytics/text mining.

KDnuggets: How much did you use text analytics / text mining in the past 12 months?

Used on over 50% of my projects     21.5%
Used on 26-50% of my projects        9.9%
Used on 10-25% of projects          14.9%
Used on < 10% of my projects        19.0%
Did not use                         34.7%

KDnuggets reaches data miners, a technically sophisticated audience who are among the most likely of any market segment to have embraced text analytics. The rate of text-analytics adoption by data miners surely exceeds the rate of adoption by any other user sector.

As an aside, 49% of KDnuggets respondents stated that in comparison to the last 12 months, in the next 12 they would use text analytics more, whether on additional projects or more intensively on a steady project workload. 43% stated their use would remain about the same and only 8% anticipated less use.

[12] http://www.kdnuggets.com/2011/07/poll-text-analytics-use.html
Demand-Side Study 2011: Response

The subsections that follow tabulate and chart survey responses, which are presented without unnecessary elaboration.

Q1: Length of Experience

As in 2009, the 2011 survey opened with a basic question – How long have you been using Text/Content Analytics?

Length of experience                      2009 (n=107)   2011 (n=224)
not using, no definite plans to use           16%             6%
currently evaluating                          22%            21%
less than 6 months                             8%             3%
6 months to less than one year                 5%             5%
one year to less than two years                7%            12%
two years to less than four years             18%            20%
four years or more                            25%            33%

We see that 2011 responses skew to longer experience than measured in 2009. Survey results were not based on a scientifically designed or measured population sample, however, neither in 2011 nor in 2009. Survey-measured experience is far out of proportion to that of the broad business population, given that the addressable market for text/content analytics likely extends far beyond the current user base. The most plausible conclusion one can draw from Q1 responses is therefore that 2011 survey outreach failed to bring in the proportion of new and prospective users reached in 2009.

Nonetheless, Q1 responses will prove illuminating in analyses of subsequent survey questions, in studying how attitudes vary by length of text/content analytics experience.
Q2: Application Areas

What are your primary applications where text comes into play?

[Bar chart: share of respondents naming each primary application area, 2011 (n=219) versus 2009 (n=103). Leading areas, 2011/2009: Brand/product/reputation management 39%/40%; Voice of the Customer / Customer Experience Management 39%/33%; Search, information access, or Question Answering 39%/36%.]

The 219 respondents in 2011 chose a total of 748 primary applications, an average of 3.4 primary applications per respondent. While there is some category overlap, it is notable that respondents are applying text analytics toward multiple business needs.
Q3: Information Sources

What textual information are you analyzing or do you plan to analyze?

[Bar chart: share of respondents analyzing or planning to analyze each information source, 2011 (n=215) versus 2009 (n=100). Leading sources, 2011/2009: blogs and other social media 62%/47%; news articles 41%/44%; on-line forums 35%/35%; customer/market surveys 35%/34%; review sites or forums 30%/21%.]
The 215 respondents in 2011 chose a total of 962 textual-information sources, an average of 4.5 sources per respondent. The big news is not news at all: Social sources are by far the most popular and 4 of the top 5 categories are social/online (as opposed to in-enterprise) sources. Despite social’s status, however, it is a source for barely more than 6 out of 10 respondents.
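The per-respondent averages quoted for Q2 and Q3 follow directly from the reported totals. The short Python sketch below reproduces them. The respondent and selection counts are taken from the text, while the tuple layout and the function name `mean_selections` are illustrative assumptions.

```python
# (label, number of respondents, total selections) as reported for the 2011 survey.
REPORTED = [
    ("Q2 primary applications", 219, 748),
    ("Q3 information sources", 215, 962),
]


def mean_selections(respondents: int, selections: int) -> float:
    """Average number of options chosen per respondent on a multi-select question."""
    return selections / respondents


# Round to one decimal place, matching the precision used in the report.
averages = {
    label: round(mean_selections(respondents, selections), 1)
    for label, respondents, selections in REPORTED
}

for label, value in averages.items():
    print(f"{label}: {value} selections per respondent")
```

Because both questions allowed multiple selections, the category percentages for each question sum to well over 100%.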
Q4: Return on Investment

Question 4 asked, “How do you measure ROI, Return on Investment? Have you achieved positive ROI yet?” There were 164 respondents. Results are listed from highest to lowest values of the sum of “currently measure” and “plan to measure”:

How do you measure ROI, Return on Investment?

ROI measure                                                Measure:    Measure:        Plan to
                                                           Achieved    Not Achieved    Measure
higher satisfaction ratings                                  19%         18%             28%
increased sales to existing customers                        13%         18%             29%
ability to create new information products                   11%         13%             27%
improved new-customer acquisition                             9%         15%             25%
higher customer retention/lower churn                        10%         12%             23%
higher search ranking, Web traffic, or ad response           10%         12%             22%
reduction in required staff/higher staff productivity         9%          9%             23%
fewer issues reported and/or service complaints               9%          6%             23%
lower average cost of sales, new & existing customers         5%          7%             23%
faster processing of claims/requests/casework                10%          6%             19%
more accurate processing of claims/requests/casework          6%          7%             20%

Out of 164 respondents, 62 (37.8%) report that they have achieved positive ROI according to some measure. Those 62 respondents reported achieving ROI according to a total of 182 measures, that is, 2.94 ROI-achieved measures for each respondent who achieved positive ROI.

Out of 164 respondents, 50 are measuring ROI but have not yet achieved positive ROI according to any measure.

The 112 respondents who are measuring ROI (whether achieved or not) track a total of 385 measures among them, 3.44 measures per respondent.

The following are several of the Other responses given:

• Better customer insight, market intelligence, and competitive intelligence.
• Content findability.
• Creation of scientific knowledge.
• Higher employee engagement and better L&D outcomes.
• Improvement in existing processes, turnover time.
• Incremental sales lift.
• Lowered cost of fraud, more accurate predictive analytics.
• Number of action executives can take, estimated dollar savings from risk correction/avoidance.
• Patient outcomes.
• Providing better data to scholars.
• Reduction of Claim Cost.
• Stronger understanding of subconscious emotional zones.
• We don’t know how to measure it properly.

Q5: Mindshare

A word cloud, generated at Wordle.net, seemed a good way to present responses to the query, “Please enter the names of companies that you know provide text/content analytics functionality, separated by commas. List up to the first 8 that come to mind.” There were 129 responses, many offering several companies. A bit of data cleansing was done, to regularize names and remove inappropriate responses.

[Word cloud: company names given in 2011 responses]

Contrast with the 2009 word cloud (deliberately rendered smaller than the 2011 cloud, without an attempt to create sizing consistent between the two clouds) based on 48 response records, as follows:

[Word cloud: company names given in 2009 responses]

Note that IBM acquired SPSS in mid-2009.
  • 23.
Q6: Spending
Question 6 asked about 2010 spending and 2011 expected spending.

How much did your organization spend in 2010, and how much do you expect to spend in 2011, on text/content analytics software/service solutions?

Spending level                 2010 spent (n=176)   2011 expected (n=165)
$1 million or above            6%                   7%
$500,000 to under $1 million   2%                   3%
$200,000 to $499,999           4%                   6%
$100,000 to $199,999           7%                   7%
$50,000 to $99,000             9%                   7%
under $50,000                  23%                  30%
use open source                15%                  19%

Questions asked of only current text/content-analytics users.
Questions 8 through 13 were posed exclusively to current text/content analytics users, that is, to the 81.2% of the 206 Q7 respondents who answered yes to “Are you currently using text/content analytics?”

Q8: Satisfaction
Question 8 asked, “Please rate your overall experience – your satisfaction – with text analytics.” It offered five categories, listed here with response counts:
• Overall experience/satisfaction (n=117, of whom 3 No experience/No opinion).
• Ability to solve business problems (n=114, 12 NE/NO).
• Solution/technology ease of use (n=112, 5 NE/NO).
• Solution/technology performance (n=114, 4 NE/NO).
  • 24.
• Availability of professional services/support (n=112, 13 NE/NO).
Responses, which across categories are somewhat anomalous, are as shown:

[Stacked bar chart: Please rate your overall experience – your satisfaction – with text/content analytics. One bar per category; segments: Very disappointed, Disappointed, Neutral, Satisfied, Completely satisfied.]

Overall, 70% of current-user respondents who had an opinion reported themselves Satisfied/Completely Satisfied, even while the breakout-category shares were 59%, 36%, 47%, and 42% Satisfied/Completely Satisfied. We can surmise that the numbers who voiced “No experience/No opinion” for the breakout categories tended to have a favorable overall experience.
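The response counts and satisfaction shares above can be tabulated to show how many respondents actually voiced an opinion in each category and how far each breakout category trails the overall rating. This is a rough sketch under one assumption: the four breakout percentages are taken to follow the same order as the category list, which the text implies but does not state. All names in the code are illustrative.

```python
# (responses, "No experience/No opinion" count) per Q8 category, as listed.
q8_counts = {
    "Overall experience/satisfaction": (117, 3),
    "Ability to solve business problems": (114, 12),
    "Solution/technology ease of use": (112, 5),
    "Solution/technology performance": (114, 4),
    "Availability of professional services/support": (112, 13),
}

# Satisfied + Completely satisfied shares (percent) quoted in the text.
# Assumption: the breakout figures 59, 36, 47, 42 follow the list order.
satisfied_pct = dict(zip(q8_counts, (70, 59, 36, 47, 42)))

OVERALL = "Overall experience/satisfaction"

# Respondents who expressed an opinion in each category.
with_opinion = {cat: n - no_opinion for cat, (n, no_opinion) in q8_counts.items()}

# Percentage-point gap between the overall rating and each breakout category.
gap_vs_overall = {
    cat: satisfied_pct[OVERALL] - pct
    for cat, pct in satisfied_pct.items()
    if cat != OVERALL
}

for cat in q8_counts:
    line = f"{cat}: {with_opinion[cat]} with an opinion, {satisfied_pct[cat]}% satisfied"
    if cat in gap_vs_overall:
        line += f" ({gap_vs_overall[cat]} points below overall)"
    print(line)
```

Every breakout category falls at least 11 points below the overall rating, which is the anomaly the text remarks on.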
  • 25.
[Radar chart: Experience/satisfaction sentiment polarity. Series: Positive, Neutral, Negative. Axes: Overall experience / satisfaction; Ability to solve business problems; Solution / technology ease of use; Solution / technology performance; Availability of professional services / support.]

Q9: Overall Experience
Question 9 asked, “Please describe your overall experience – your satisfaction – with text analytics.” The following are 49 from among the 63 responses, categorized, lightly edited for spelling and grammar and with the names of three products masked:

Happy
It works.
Excellent.
Absolutely essential.
Very satisfied, most goals exceeded, big jump in effectiveness and customer satisfaction.
Pretty happy given we are in a highly technical different to monitor/track niche.
Saving a lot of time for our journalists.
We have found having an application with the capabilities to clean and normalize the text and quantitative data, process it to a form to analyze, and run text mining and categorization on an ad hoc or production basis has greatly enhanced my team's capabilities and productivity.
We found great value from using a Speech Analytics solution to retain customers and improve the overall customer experience through root-cause analysis.
I have been working with text analytics for academic and scientific purposes and I am quite satisfied with results achieved.
I work with nurse and social science researchers. They think that a chat with 20 people is research. I tend to analyze hundreds or thousands of free-text comments.
  • 26.
I use software to overcome the biases inherent in manual analysis.

It Takes Work
Very powerful tool but requires the organization's ability to take action on the insights.
Valuable tool; my clients are content to underutilize it, so what is available more than meets our needs.
Since we use open source, the ROI is basically how much time you put into the solution and how many problems it solves. We have been successful so far.
Very Satisfied but extremely labor intensive
We provide this as a tool to our clients in our application for publishing press releases. It works fine but could be better but that is up to us to implement it fully.
Once you spend man hours to set up the tool, it is extremely consistent on doing what you tell it to do.
I know improvements are coming but I'd like more AI from text analytics tools than what is currently offered.
Do-It-Yourself is challenging but not impossible. Very cheap to operate.
Fairly satisfied – problem is I am sole researcher and data/text clean-up takes too much time given other demands.
I've been a user and vendor of text analytics (in fact, in my early <...> days, we helped coin the phrase “text analytics”). Vendors generally overpromise and have difficulty delivering. Both vendors and customers underestimate the amount of resources required to get it right. So, still hard to use for mainstream purposes.

Reservations and complications
Steep learning curve.
I am currently satisfied, but I believe we (as analysts) are just beginning to fully unlock the full potential of text analytics.
On one hand, I'm amazed and thrilled that this stuff exists at all. But on the other hand, I haven't seen anything that does just what I want it to do.
It's opened up opportunities to analyze unstructured data but not at the same level as structured data.
Works well at highest level of analysis (e.g. sentiment) but not as well in auto-coding for custom (i.e. project) studies.
Tools are good, but lack transparency, ability to explain how conclusions are reached.
There is still a lot of work required to optimize this technology since it can currently provide concepts but does not capture context and it’s a lot of slow painful work to get the software to recognize context in which something is mentioned and accuracy is still not a lot.
  • 27.

Unmet needs
Very promising technology but some difficulties to
- Implement smoothly text mining component into existing information system.
- Cope with various languages, formats, volumes, etc. of data.
- Measure and demonstrate tangible results in terms of improved information extraction quality.
- Assess ROI (reducing processing time / saving resources for core tasks e.g. analysis).
Powerful but overly difficult, impenetrable - technology vs. solutions.
An emerging and enabling technology in our business with broad applicability. Satisfied in our applications with accuracy and precision but hitherto disappointed with export capability to other applications.
Still a volatile market for applications beyond VOC/sentiment analysis.
Vendors are eager to please but sometimes overstate the capabilities. However, I still have limited experience in solving real business problems with these tools (I am a consultant).
I think this field is in its infancy. Lots of issues with data quality. Sentiment analytics often flawed. Hard to scale or automate.
The handful of companies and solutions I came across do not seem to marry or integrate structured and unstructured text easily... Algorithms are not quite available as a function or way to improve accuracy.
I feel there is so much more work to be done both on the analysis side and also on the business implementation side. While I work heavily in this area, I won't be more satisfied until I see better end-to-end integration and until I see more effective and systematic use of insights.
I do everything myself. The lack of good lexical resources and taxonomies is a real problem that drives up the cost (in manpower) of providing a solution. And the complexity of the infrastructure required vs. the apparent simplicity of the problem (in managers' minds) makes it very difficult to adjust expectations.
We use <...> and we have to write our own routines to find the text and content that we are interested in. There are plenty of functions that help us with our goals but obviously there is still much that we need to do to higher recall and accuracy.
<...> is the only tool which is both open source and professionally useful. However in spite of 20 years of development, it still has a very poor user interface as well as API interface which hinder productivity and acceptance at a beginner's level.

Skepticism
Jury is still out.
It’s still evolving, accuracy of results something to watch for in iterations.
  • 28.
Still learning. Very early days!
Promising but still very difficult to see quick results. Everything seems to take ages and it’s been a painful learning curve. Hard to trust the automated results when you've been used to achieving 100% with manual human analysis.
Still too new.
Field as a whole is underperforming what is possible.
Though the concept is very appealing, it is still in its native stages, and a lot more possibilities are left to be explored. IBM Watson is a good step ahead in that direction.
Very poor, almost useless.

Looking ahead
On the whole, very satisfied with the range of solutions available and their ease of use. Very much looking forward to watching the technology progress – it's obviously not perfect yet.
Unlike structured data, getting value out of text analytics tools require understanding of text elements – how to utilize occurrence of different parts of speech, how to interpret different types of sentences like requests, commands, opinionated sentences, etc. Domain knowledge and tunable and adaptable systems are a must for success. Non-availability of trained personnel to provide text mining services leads to dissatisfaction of users. Business end users do not like to use the tools themselves because of the complexity. The process or strategy for text mining needs to be established.
We're pretty happy with text analytics and see it as a transformational technology. Most of text analytics' problems lie in how it is sold. It is both broad and deep and has a myriad of tools best suited for very different use cases, but customers think "text analytics is text analytics." Really, “text analytics” is a horrible term that needs to be broken up into component parts.

Q10: Providers
Question 10 asked, “Who is your provider? Enter one or more, separated by commas, most important provider first.” There were 77 response records, listing providers (sorted and without counts):
Autonomy, Clarabridge, Colbenson, Content Analyst, Expert System, GATE, IBM, in-house, Lexalytics, LingPipe, Megaputer, MotiveQuest, open source, Open Text, R, Radian6, Rapid-I, Saplo, SAS, Smartlogic, Sysomos, TEMIS, Teradata, TextKernel, Thomson Reuters (including Calais, ClearForest), Verint, Zemanta
Note that the survey asked, “Please respond only if you are a user, prospect, integrator, or consultant.”
  • 29.

Q11: Provider Selection
Question 11 asked, “How did you identify and choose your provider? (If more than one, limit response to your most important provider.)”
Applicability, robust performance, open source.
Research.
Experience and luck.
Very satisfied reference customers with similar applications, most flexible solutions, expertise of consultants, high quality of service, extreme agility, and extremely rapid idea-to-deployment cycles.
They contacted us before launch of their first product.
Product evaluation in context of business application.
Based on business requirements in the framework of a European competitive tender procedure.
Advised by a related Web development consultant.
We spent about a year evaluating and classifying vendors that in part or whole would fill our needs as expressed in Q9. We decided on using an application with integrated quantitative and qualitative analytic capabilities as the best possibilities. We ended up doing POC's with SAS, SPSS and Megaputer, and ended up choosing the latter.
We evaluated multiple providers based on (1) tool flexibility – can we customize? (2) accuracy (3) type of content it can tag (4) sentiment methodology (5) price.
Main criteria are cost, multi-language capability, and integration with SAS.
Competitive bids.
Large existing analytics relationship: Tool was an add on.
Conducted a thorough investigation of leading providers in the space.
Quality and reviews.
Personal recommendations.
Constructed a needs analysis ranking system. Our needs included ease of integration, tools, ability to produce meaningful results at sub-document (short document) level, ease of (or no) training.
Networking and academic partnerships.
  • 30.
Proof of Concept – evaluated about a dozen or so vendors – have not selected a TM vendor as yet.
Recommended by a trusted source.
Based on recommendations <...> and our own search / lab testing, which brought us to <...>.
Introduction from my manager.
What my client uses.
It was an obvious choice since there was no real alternative on the market (i.e. language is limiting the products).
We compared a number of providers and decided to go for <...> that have a local presence and are experts on the Swedish language.
Free, for research purposes.
Trials based on performance.
Price/performance tradeoff and applicability to targeted business problem.
I work for the company.
Use other languages (Perl) as necessary.
Tested various services, rated results.
I choose the vendor or tools based upon my client application needs.
We do not have a primary provider ... we maintain a library of tools and use many of them in the same project.
Trying all major ones.
Reputation, personal contacts.
Established open-source project.
Market research, pricing, case studies and product evaluations.
It was recommended to us.
Working in-house.
I worked for one of them and selected the other on their open source commitment.
Proof of Concept.
Support for Drupal.
Cost, applicability to needs.
  • 31.
I don't, that's up to my clients. But my advice to them is to begin with an understanding of the goals, and work backward to identify the provider.
Company demo.
Recommendation from experts, and tried and tested different ones.
Management mandate, client.
RFP to replace an existing legacy system.
We already used <...> and had everything we needed to do Proof of Concept; waiting for business reason to acquire <...>.
Advanced, scalable LSI technology.
Work for them.
Ability to mine audio, text, and customer surveys.

Q13: Promoter?
Question 13 is new with the 2011 survey; we did not ask it in 2009. It is a basic net-promoter type question, without the “net” part: “How likely are you to recommend your most important provider to others who are looking for a text/content analytics solution?” Of 87 responses, 49% were positive, 23% were neutral, and 28% were negative.

[Pie chart: How likely are you to recommend your most important provider? Seven response categories: Extremely, Moderately, and Slightly likely to recommend against; Neither likely to recommend nor recommend against; Slightly, Moderately, and Extremely likely to recommend. Segment values as extracted, largest first: 34%, 23%, 15%, 10%, 7%, 6%, 5%.]

Promoters outweigh detractors by a net of 21 percentage points.
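The net figure follows directly from the positive and negative shares. The sketch below reproduces it from the three percentages and the response count given in the text. The approximate head counts are estimates obtained by rounding and are not reported in the study; all names are illustrative.

```python
# Q13 shares (percent) and response count as quoted in the text.
responses = 87
shares = {"positive": 49, "neutral": 23, "negative": 28}

# The three shares should account for every response.
assert sum(shares.values()) == 100

# Net recommendation score: positive share minus negative share,
# expressed in percentage points.
net_points = shares["positive"] - shares["negative"]

# Estimated head counts (an approximation: the study reports only percentages).
approx_counts = {label: round(responses * pct / 100) for label, pct in shares.items()}

print(f"Net score: {net_points} percentage points")
print(f"Approximate respondents per group: {approx_counts}")
```

The subtraction gives 49 − 28 = 21, matching the net quoted above. Note that this is a simple positive-minus-negative difference on the survey's own seven-point scale, not a score computed on the 0 to 10 scale used in the standard Net Promoter methodology.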