JOB AID Basic Data Mining and Analysis:

Data Integrity, Description, and Anomaly Detection

Health care enterprises process large volumes of data yet may have problems transforming data into actionable management information and business intelligence. Health care providers and other professionals also face a breathtaking array of data mining and analysis methods, tools, products, and services. As a result, while understanding that data mining and analysis help establish better controls, operationalize the “sentinel” effect, and demonstrate commitment to “doing the right thing right,” it can be hard to know where to start the process.

Given program integrity’s general objective to identify “what is not right,” understanding what anomalies are and the basics of how to find them is a good place to start. In general, this ordered process should be a good first step.

1. Characterize the Question

Before considering “what data do I need,” “where will I get it,” and “what will I do with it,” write down the question(s) you want to answer and then classify them by type. The classification scheme used by auditors, discussed below, can be quite helpful. It reveals how reviewers and auditors often structure their work and corresponds to the general flow of the data analysis process.

Audit questions fall into three general categories (descriptive, normative, or cause-and-effect):[1]

• Descriptive: Provides descriptive information about specific conditions of a program or activity;
• Normative: Compares an observed outcome to what is expected; and
• Cause-and-effect: Determines if observed conditions, events, or outcomes can be attributed to the operation of the program or activity.

Each category corresponds to a typical kind of analysis:

• Descriptive analysis usually works with one variable at a time; that is, it is “univariate”;
• Normative analysis usually works with two variables, the “norm” and “what is,” and is, therefore, usually “bivariate”; and
• Cause-and-effect analysis usually works with more than two variables and, as such, is “multivariate.”

As we shall see, by classifying question types and the analyses to answer them, these three categories also help one choose which tool(s) to use. They also lay the foundation for the general process of data analysis: describe it, then norm it, and then try to predict it. This process moves from relative simplicity toward greater complexity as shown in Figure 1.

Figure 1. General Data Analysis Model

Characterizing and documenting analytical questions help specify how and from where to collect the appropriate data, a topic not addressed in this document.

2. Control the Data

Good data analysis starts with good data. Even the best analytical tools cannot cure invalid or unreliable data. Data validity and reliability are bolstered by good information security (IS) including:

• Strong passwords;
• Good physical access controls, such as controlled and task- and staff-specific access to particular applications and administrative functions;
• Policies and procedures on tailgating, unrecognized persons, suspicious activity, workstation information control, and technology use while traveling and telecommuting;
• Means to identify and address cyber crimes such as identity theft, credit card abuse, spam, malware, hoaxes, cookies, Active X® applications, and phishing;
• Appropriate control over personal electronics and software at the workplace; and
• Firewalls, anti-virus and encryption software, and file back-up and retention.[2]

Health care data control includes protecting privacy using controls like:

• Accessing, using, and sharing information only on a need-to-know basis;
• Storing, transferring, and transmitting personal identifying information (PII) only by encrypted means;

–  –  –

• Using social security numbers (SSNs) and healthcare identification numbers only at the time needed and never printing them in their entirety;
• Shredding, rather than merely discarding, documents containing PII; and
• Reporting and fully disposing of potential or actual privacy breaches and enacting appropriate disciplinary measures.[3]

It can be helpful to discuss your business, data and information needs, and technology infrastructure with your medical practice management software vendor to identify material threats, vulnerabilities, and risks.
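One way to honor the control above about never printing SSNs in their entirety is to mask them before any report is produced. The sketch below is illustrative only: the function name `mask_ssn` and the choice to reveal the last four digits are assumptions, not something this job aid prescribes.

```python
def mask_ssn(ssn: str) -> str:
    """Return an SSN with all but the last four digits masked.

    Showing the last four digits is an assumed convention; a stricter
    policy could mask every digit.
    """
    # Keep only the digits so "123-45-6789" and "123456789" behave alike.
    digits = "".join(ch for ch in ssn if ch.isdigit())
    if len(digits) != 9:
        # The message deliberately does not echo the value itself.
        raise ValueError("expected a nine-digit SSN")
    return "***-**-" + digits[-4:]
```

Applying the mask at the point where data leave the system (reports, exports, screen displays) keeps the full number available only where it is actually needed.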

3. Know Your System

The efficiency and effectiveness of data mining and analysis directly relate to the quality of the system used, electronic or otherwise. Finding, understanding, and controlling anomalies are the foundation of program integrity and require a measurement system that can:[4]

• Detect and show small changes (resolution);
• Respond to change in equal, constant, or appropriate ways (linearity);

–  –  –

• Specify differences between high-side (upscale) and low-side (down-scale) values (hysteresis);
• Be consistent with like systems (difference among like gauges or like configurations); and
• Reveal variances in process, policy, practice, or people (difference among operations or operators).

4. Know Your Limits

Before “crunching” data, determine the capacity and capability of the software used for mining and analysis to avoid “hitting the wall” in the middle of an analysis project. On the data capacity side, find out about the:

• Amount of computer memory required to run the software smoothly;
• Maximum number of available fields (columns) of data;
• Maximum number of available records (rows) of data;
• Maximum number of characters available in a cell or mathematical formula; and

–  –  –

Regarding capability, learn about:

• Any special demands on, or requirements of, your operating system;
• What kinds of data (text and numbers) and file formats the software can handle;
• How easy or difficult it is to load, administer, change, and save files;
• Whether the tools needed are easy to get to without a lot of keystrokes or menu drill-downs;
• Whether basic arithmetic and graphs are easy to do;
• The types and numbers of formulas the software includes;

–  –  –

• Getting product updates and assistance; and
• The depth of detail and usefulness of the Help feature.

Be familiar with the existing practice management technology, especially before investing in new hardware or software. Find out if special or one-off analyses or reports are better created by customizing current software or exporting data to, and using, another application.
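The capacity questions in this section can be checked against a concrete data extract before loading it into an analysis tool. The following is a minimal sketch, assuming the extract is CSV text with a header row; the names `extract_profile` and `fits`, and any limit values passed in, are hypothetical and should be replaced with the figures documented for the software actually used.

```python
import csv
import io


def extract_profile(csv_text: str) -> dict:
    """Measure the size of a CSV extract (assumes a header row is present)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    return {
        "fields": len(header),    # number of columns
        "records": len(data),     # number of rows, header excluded
        # Longest cell anywhere in the file, header included.
        "longest_cell": max((len(c) for r in rows for c in r), default=0),
    }


def fits(profile: dict, limits: dict) -> bool:
    """True if every measured size is within the corresponding limit."""
    return all(profile[name] <= limit for name, limit in limits.items())
```

For example, `fits(extract_profile(text), {"fields": 200, "records": 50000})` would report whether an extract stays within a tool whose (assumed) limits are 200 columns and 50,000 rows.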

5. Know Your Data

Before mining or analyzing data, be sure to:

• Understand what each field (column) and record (row) contains and means, using a data dictionary, vetting against the data source, or other suitable means;
• Include a unique row counter, preferably in the far left-hand column;
• Save the original data set as a separate file and then work from a copy of that file;
• Formulate and document the question(s) you want to answer; and
• Delete irrelevant fields or records to make the file more manageable.
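Two of the steps above, adding a unique row counter and working from a copy rather than the original, can be combined in one pass. This is a sketch under stated assumptions: the data are in a CSV file with a header row, and the function name `make_working_copy` and column name `row_id` are illustrative choices, not part of the job aid.

```python
import csv
from pathlib import Path


def make_working_copy(original: Path, working: Path) -> int:
    """Write a copy of `original` to `working` with a row counter prepended.

    The original file is only read, never modified. Returns the number of
    data rows written (the header row is not counted). Assumes the file
    has a header row.
    """
    with original.open(newline="") as src, working.open("w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader)
        writer.writerow(["row_id"] + header)  # counter in the far-left column
        count = 0
        for count, row in enumerate(reader, start=1):
            writer.writerow([count] + row)
    return count
```

Because the counter is assigned in the original file order, any record in the working copy can be traced back to its source row even after sorting or filtering.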

6. Assess Data Quality

What is “good” data? For auditors, data and the evidence they yield must be:[5]

• Sufficient: Is there enough data to persuade a knowledgeable person that the analysis and its results are reasonable?
• Relevant: Does the data have a logical relationship with, and importance to, the issue being addressed?
• Valid: Does the data give a meaningful, reasonable basis for measuring what is being evaluated?
• Reliable: Will the data and related analysis provide consistent results when information is measured or tested and are they verifiable or supported?

The quality of data mining and analysis can be enhanced by using multiple data analysis methods to help offset the weaknesses inherent in viewing or analyzing something in only one way. For example, interview evidence is more credible if supported by physical or documentary evidence. Information from independent external sources is generally more reliable than from a single internal source.

7. Assess Data Integrity

The basic data integrity procedures below apply to any data set. Remember that data errors can exist but not matter, though they might point to control issues beyond the given analytical context. The importance of each data integrity issue hinges on several key questions:

• Does the error relate to this analysis or this question?

–  –  –

• Is this relationship material, that is, does it matter?
• Is this relationship significant? That is, does it matter enough to warrant seeking other data or changing or abandoning the analysis or the question?
• Does this type of error raise other important or future questions?

More specifically, inspect each column of data that matters to your analysis and look for:

• Blanks: Some fields, like claim numbers or patient identification numbers, should not be blank.
• Zeroes: Some zeroes are appropriate, others are not, particularly if a zero is a proxy for a blank.
• Error Values: Entries like #N/A, #REF!, #NUM!, and #NULL! can be inappropriate or indicate that data actually exists in the original source file but a calculation failed or data did not migrate.
• Unprintable Characters: On-screen data sometimes contain apostrophes, dashes, carets, or other characters that do not print but prevent data from functioning properly, especially when the data came from another (mainframe or legacy) computer application.
• Unnecessary Spaces: Spaces appearing before the first or after the last character in a cell can be residues of tabs, returns, or other commands that make even basic arithmetic impossible.
• Numbers Formatted as Text: Quantities imported as text often fail to calculate properly.
• Duplicates: Depending on the data extract, some fields and data values should not duplicate, such as claim numbers or dates.
• Edits: An overabundance of edits, corrections, and adjustments can highlight control or education issues or attempts to “cover one’s tracks.”
• Foreign Items: Data which are imported but were not actually requested or part of the data query are a possible indication of misunderstanding or a broader software failure.
• Unreasonable Values: This includes things like dates in the next century, ten-digit SSNs, two-digit Current Procedural Terminology (CPT) codes, numbers in names, a million-dollar copayment, etc.

Sorting a column ascending and then descending (or vice versa) often reveals such data errors.
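Several of the column checks listed above can be automated. The sketch below is a minimal illustration using only the Python standard library; the function names, the report keys, and the exact rules (for example, treating whitespace-only cells as blank) are assumptions made for this example, and it covers only some of the checks (blanks, zeroes, error values, unprintable characters, unnecessary spaces, numbers stored as text, and duplicates).

```python
from collections import Counter

# Spreadsheet-style error markers named in the list above.
ERROR_VALUES = {"#N/A", "#REF!", "#NUM!", "#NULL!"}


def is_number(text: str) -> bool:
    """True if the text can be parsed as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_column(values, must_be_unique=False):
    """Count common integrity problems in one column of raw cell values."""
    report = Counter()
    for value in values:
        # None, empty, and whitespace-only cells all count as blank.
        if value is None or (isinstance(value, str) and value.strip() == ""):
            report["blank"] += 1
            continue
        if isinstance(value, str):
            if value != value.strip():
                report["extra_spaces"] += 1  # leading or trailing whitespace
            text = value.strip()
            if text in ERROR_VALUES:
                report["error_value"] += 1
            elif not text.isprintable():
                report["unprintable"] += 1   # control characters and the like
            elif is_number(text):
                report["number_as_text"] += 1
                if float(text) == 0:
                    report["zero"] += 1
        elif value == 0:
            report["zero"] += 1
    if must_be_unique:
        # Each repeat beyond the first occurrence counts as one duplicate.
        counts = Counter(values)
        report["duplicate"] = sum(n - 1 for n in counts.values() if n > 1)
    return dict(report)
```

A nonzero count is a prompt to look at the rows involved, not proof of an error; as noted above, some zeroes and some duplicates are legitimate.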

8. Know the Data Type

Before choosing tools, set ground rules to help match data analysis method and purpose. Tool choice is driven by data type. Misinterpreting data type can result in inappropriate methods and create invalid, inaccurate, ineffective, or incorrect results.

Three general types of data exist:[6]

• Nominal data, or “attributes,” use names, categories, or labels for qualitative values, such as gender, ethnicity, job title, etc.
• Ordinal data, or “ranks,” are also categorical variables. The order of the categories has meaning, as in surveys using an ordinal scale ranging from “poor” to “excellent.” Such categories are often converted to numbers (4, 3, 2, and 1) for further analysis.
• Interval variables, usually called just “variables,” are true numbers, like dollar amounts or age. Nominal and ordinal data do not assert degree; for example, it is meaningless to say “one person is three times more male than another” or “person A said this training was five times more excellent than person B.” Interval variables have meaningful value differences and allow statements about extent or degree.

Program integrity data analysis usually involves only attributes, which answer questions about compliance and control (was it done right, “yes” or “no”), and variables, which answer questions about volume and value (how many claims were done right), as distinguished in Table 1.

Table 1. How Attributes and Variables Differ

Attributes | Variables
Answer “yes/no” questions: “Are you male?” | Answer numeric questions: “How many males?”
Cannot assert degree: “I am twice as male.” | Can assert degree: “I am twice as old.”
–  –  –
Can have only two values: “Yes” or “No.” | Can have an infinite number of possible values.

9. Describe the Data

There are two general types of descriptive statistics:[7]

• Measures of central tendency: Where data tend to fall.
• Measures of spread: How spread out or concentrated the data are.

Three central tendency measures are common, and each works best with a given data type:

• Mean (average value): Best measure of a variable (quantity); for example, how much did the patient pay for each visit last year on average?
• Mode (most frequent value): Best measure of an attribute (quality); for example, is the patient on Medicaid?
• Median (middle value): Best measure of a rank; for example, does the patient rate services as excellent, good, fair, or poor?
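The pairing of measure and data type described above can be written as a small helper. This is a sketch using Python's standard `statistics` module; the function name `central_tendency` and the type labels are illustrative assumptions. For ranks it uses `median_low`, an assumed choice that always returns a value actually present in the data rather than averaging two middle ranks.

```python
import statistics


def central_tendency(values, data_type):
    """Pick the measure of central tendency suited to the data type."""
    if data_type == "interval":   # true numbers: use the mean
        return statistics.mean(values)
    if data_type == "ordinal":    # ranks coded as numbers: use the median
        return statistics.median_low(values)
    if data_type == "nominal":    # labels or categories: use the mode
        return statistics.mode(values)
    raise ValueError("data_type must be 'interval', 'ordinal', or 'nominal'")
```

For instance, visit payments of 20, 25, 30, and 45 dollars have a mean of 30, while a list of coverage labels is summarized by whichever label occurs most often.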

Common measures of data spread include:[8, 9]

• Range: Difference between largest and smallest values.
• Interquartile range: Difference between the 75th and 25th percentiles.
• Standard deviation: Square root of the average squared difference between each datum and the mean. On the bell curve, about 34 percent of values fall between the mean and one standard deviation on either side of it.
• Skew: Measures whether data are symmetrical to the left and right of center; zero on the bell curve.
• Kurtosis: Measures whether the data are peaked or flat relative to a normal distribution; zero on the bell curve.
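The spread measures listed above can be computed together. The following is a minimal sketch with Python's standard `statistics` module, under several stated assumptions: it uses the population standard deviation (dividing by n), the module's default quartile method, and simple moment-based formulas for skew and kurtosis, with kurtosis reported as "excess" kurtosis so that a normal distribution scores zero. Other tools may use slightly different formulas and give slightly different numbers. The function name `describe_spread` is hypothetical.

```python
import statistics


def describe_spread(values):
    """Return common spread measures for a list of numbers.

    Needs at least two values that are not all identical, because the
    skew and kurtosis formulas divide by the standard deviation.
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)                 # population standard deviation
    q1, _, q3 = statistics.quantiles(values, n=4)  # 25th, 50th, 75th percentiles
    z = [(x - mean) / sd for x in values]          # standardized values
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "std_dev": sd,
        "skew": sum(v ** 3 for v in z) / n,
        "excess_kurtosis": sum(v ** 4 for v in z) / n - 3,
    }
```

A positive skew indicates a longer tail of high values, which in claims data is often where unusually large amounts appear.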
