Entity information life cycle for big data : (Record no. 22265)

MARC details
000 -LEADER
fixed length control field 08327cam a22003735a 4500
001 - CONTROL NUMBER
control field 18956824
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20160828160806.0
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 160201t2015 ne a frb 001 0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9780128005378
040 ## - CATALOGING SOURCE
Original cataloging agency CDX
Language of cataloging eng
Transcribing agency CDX
Modifying agency OCLCQ
-- YDXCP
-- BTCTA
-- IAI
-- CHVBK
-- OCLCF
-- SINLB
-- AU@
-- OCLCQ
-- DLC
-- EG-ScBUE
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 025.04
Edition number 22
Item number TAL
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name Talburt, John R.
9 (RLIN) 40947
245 10 - TITLE STATEMENT
Title Entity information life cycle for big data :
Remainder of title master data management and information integration /
Statement of responsibility, etc John R. Talburt, Yinle Zhou.
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Place of publication, distribution, etc Waltham :
Name of publisher, distributor, etc Morgan Kaufmann / Elsevier,
Date of publication, distribution, etc c.2015.
300 ## - PHYSICAL DESCRIPTION
Extent xviii, 235 p. :
Other physical details ill. ;
Dimensions 24 cm.
500 ## - GENERAL NOTE
General note Index : p. 227-235.
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc Bibliography : p. 219-225.
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note Machine generated contents note: ch. 1 The Value Proposition for MDM and Big Data -- Definition and Components of MDM -- Master Data as a Category of Data -- Master Data Management -- Entity Resolution -- Entity Identity Information Management -- The Business Case for MDM -- Customer Satisfaction and Entity-Based Data Integration -- Better Service -- Reducing the Cost of Poor Data Quality -- MDM as Part of Data Governance -- Better Security -- Measuring Success -- Dimensions of MDM -- Multi-Domain MDM -- Hierarchical MDM -- Multi-Channel MDM -- Multi-Cultural MDM -- The Challenge of Big Data -- What Is Big Data? -- The Value-Added Proposition of Big Data -- Challenges of Big Data -- MDM and Big Data -- The N-Squared Problem -- Concluding Remarks -- ch. 2 Entity Identity Information and the CSRUD Life Cycle Model -- Entities and Entity References -- The Unique Reference Assumption -- The Problem of Entity Reference Resolution -- The Fundamental Law of Entity Resolution.
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note Note continued: Internal vs. External View of Identity -- Managing Entity Identity Information -- Entity Identity Integrity -- The Need for Persistent Identifiers -- Entity Identity Information Life Cycle Management Models -- POSMAD Model -- The Loshin Model -- The CSRUD Model -- Concluding Remarks -- ch. 3 A Deep Dive into the Capture Phase -- An Overview of the Capture Phase -- Building the Foundation -- Understanding the Data -- Data Preparation -- Selecting Identity Attributes -- Attribute Uniqueness -- Attribute Entropy -- Attribute Weight -- Assessing ER Results -- Truth Sets -- Benchmarking -- Problem Sets -- The Intersection Matrix -- Measurements of ER Outcomes -- Talburt-Wang Index -- Other Proposed Measures -- Data Matching Strategies -- Attribute-Level Matching -- Reference-Level Matching -- Boolean Rules -- Scoring Rule -- Hybrid Rules -- Cluster-Level Matching -- Implementing the Capture Process -- Concluding Remarks.
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note Note continued: ch. 4 Store and Share -- Entity Identity Structures -- Entity Identity Information Management Strategies -- Bring-Your-Own-Identifier MDM -- Once-and-Done MDM -- Dedicated MDM Systems -- The Survivor Record Strategy -- Attribute-Based and Record-Based EIS -- ER Algorithms and EIS -- The Identity Knowledge Base -- Storing versus Sharing -- MDM Architectures -- External Reference Architecture -- Registry Architecture -- Reconciliation Engine -- Transaction Hub -- Concluding Remarks -- ch. 5 Update and Dispose Phases -- Ongoing Data Stewardship -- Data Stewardship -- The Automated Update Process -- Clerical Review Indicators -- Pair-Level Review Indicators -- Cluster-Level Review Indicators -- The Manual Update Process -- Asserted Resolution -- Correction Assertions -- Confirmation Assertions -- EIS Visualization Tools -- Assertion Management -- Search Mode -- Negative Resolution Review Mode -- Positive Resolution Review Mode.
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note Note continued: Managing Entity Identifiers -- The Problem of Association Information Latency -- Models for Identifier Change Management -- Concluding Remarks -- ch. 6 Resolve and Retrieve Phase -- Identity Resolution -- Identity Resolution -- Identity Resolution Access Modes -- Batch Identity Resolution -- Interactive Identity Resolution -- Identity Resolution API -- Confidence Scores -- Depth and Degree of Match -- Match Context -- Confidence Score Model -- Concluding Remarks -- ch. 7 Theoretical Foundations -- The Fellegi-Sunter Theory of Record Linkage -- The Context and Constraints of Record Linkage -- The Fellegi-Sunter Matching Rule -- The Fundamental Fellegi-Sunter Theorem -- Attribute Level Weights and the Scoring Rule -- Frequency-Based Weights and the Scoring Rule -- The Stanford Entity Resolution Framework -- Abstraction of Match and Merge Operations -- The Entity Resolution of a Set of References -- Consistent ER -- The R-Swoosh Algorithm.
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note Note continued: Entity Identity Information Management -- EIIM and Fellegi-Sunter -- EIIM and the SERF -- Concluding Remarks -- ch. 8 The Nuts and Bolts of Entity Resolution -- The ER Checklist -- Deterministic or Probabilistic? -- Calculating the Weights -- Cluster-to-Cluster Classification -- The Unique Reference Assumption and Transitive Closure -- Selecting an Appropriate Algorithm -- The One-Pass Algorithm -- Concluding Remarks -- ch. 9 Blocking -- Blocking -- Two Causes of Accuracy Loss -- Blocking as Prematching -- Blocking by Match Key -- Match Key and Match Rule Alignment -- The Problem of Similarity Functions -- Dynamic Blocking versus Preresolution Blocking -- Preresolution Blocking with Multiple Match Keys -- Blocking Precision and Recall -- Match Key Blocking for Boolean Rules -- Match Key Blocking for Scoring Rules -- Concluding Remarks -- ch. 10 CSRUD for Big Data -- Large-Scale ER for MDM -- Large-Scale ER with Single Match Key Blocking.
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note Note continued: The Transitive Closure Problem -- Distributed, Multiple-Index, Record-Based Resolution -- Transitive Closure as a Graph Problem -- References and Match Keys as a Graph -- An Iterative, Nonrecursive Algorithm for Transitive Closure -- Bootstrap Phase: Initial Closure by Match Key Values -- Iteration Phase: Successive Closure by Reference Identifier -- Deduplication Phase: Final Output of Components -- Example of Hadoop Implementation -- ER Using the Null Rule -- The Capture Phase and IKB -- The Identity Update Problem -- Persistent Entity Identifiers -- The Large Component and Big Entity Problems -- Postresolution Transitive Closure -- Incremental Transitive Closure -- The Big Entity Problem -- Identity Capture and Update for Attribute-Based Resolution -- Concluding Remarks -- ch. 11 ISO Data Quality Standards for Master Data -- Background -- Data Quality versus Information Quality -- Relevance to MDM -- Goals and Scope of the ISO 8000-110 Standard.
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note Note continued: Unambiguous and Portable Data -- The Scope of ISO 8000-110 -- Motivational Example -- Four Major Components of the ISO 8000-110 Standard -- pt. 1 General Requirements -- pt. 2 Syntax of the Message -- pt. 3 Semantic Encoding -- pt. 4 Conformance to Data Specifications -- Simple and Strong Compliance with ISO 8000-110 -- ISO 22745 Industrial Systems and Integration -- Beyond ISO 8000-110 -- pt. 120 Provenance -- pt. 130 Accuracy -- pt. 140 Completeness.
520 ## - SUMMARY, ETC.
Summary, etc Entity Information Life Cycle for Big Data walks you through the ins and outs of managing entity information so you can successfully achieve master data management (MDM) in the era of big data. This book explains big data's impact on MDM and the critical role of an entity information management system (EIMS) in successful MDM. Expert authors Dr. John R. Talburt and Dr. Yinle Zhou provide a thorough background in the principles of managing the entity information life cycle and provide practical tips and techniques for implementing an EIMS, strategies for exploiting distributed processing to handle big data for EIMS, and examples from real applications. Additional material on the theory of EIIM and methods for assessing and evaluating EIMS performance also makes this book appropriate for use as a textbook in courses on entity and identity management, data management, customer relationship management (CRM), and related topics.
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Big data.
Source of heading or term BUEsh
9 (RLIN) 36502
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Semantic Web.
Source of heading or term BUEsh
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Pattern recognition systems.
Source of heading or term BUEsh
9 (RLIN) 4475
650 #7 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Data mining.
Source of heading or term BUEsh
9 (RLIN) 27695
651 ## - SUBJECT ADDED ENTRY--GEOGRAPHIC NAME
Source of heading or term BUEsh
653 ## - INDEX TERM--UNCONTROLLED
Resource For college Informatics and Computer Science
Arrived date list August 2016
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name Zhou, Yinle,
Dates associated with a name 1986-
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme Dewey Decimal Classification
952 ## - LOCATION AND ITEM INFORMATION (KOHA)
-- 2016-08-28
Holdings
Withdrawn status
Item status
Source of classification or shelving scheme Dewey Decimal Classification
Damaged status
Not for loan
Vendor Baccah
Home library Central Library
Current library Central Library
Shelving location Lower Floor
Date acquired 28/08/2016
Source of acquisition Purchase
Cost, normal purchase price 617.00
Serial Enumeration / chronology 25137
Total Checkouts 1
Total Renewals 2
Full call number 025.04 TAL
Barcode 000033414
Date last seen 11/06/2024
Date last borrowed 01/10/2019
Cost, replacement price 771.25
Koha item type Book - Borrowing