Before /home/pythonscripts/mpxdatacheck/LINELIST_PAHO2024_01_30_04_52_59.csv
After /home/pythonscripts/mpxdatacheck/LINELIST_PAHO2024_01_31_04_52_55.csv
---------------------------
icu
---------------------------
Answers changed:
-> Before
NO     99.439552
YES     0.533760
UNK     0.026688
Name: icu, dtype: float64
-> After
NO     90.734202
UNK     8.778765
YES     0.487033
Name: icu, dtype: float64
---------------------------
health_worker
---------------------------
Answers changed:
-> Before
NO     94.601902
YES     5.321498
9       0.058577
UNK     0.009012
0       0.009012
Name: health_worker, dtype: float64
-> After
NO     93.882753
YES     5.281045
UNK     0.769128
9       0.058132
0       0.008943
Name: health_worker, dtype: float64

DataComPy Comparison
--------------------

DataFrame Summary
-----------------

  DataFrame  Columns   Rows
0       df1       34  60549
1       df2       34  60553

Column Summary
--------------

Number of columns in common: 34
Number of columns in df1 but not in df2: 0
Number of columns in df2 but not in df1: 0

Row Summary
-----------

Matched on: index
Any duplicates on match values: No
Absolute Tolerance: 0
Relative Tolerance: 0
Number of rows in common: 60,549
Number of rows in df1 but not in df2: 0
Number of rows in df2 but not in df1: 4

Number of rows with some compared columns unequal: 826
Number of rows with all compared columns equal: 59,723

Column Comparison
-----------------

Number of columns compared with some values unequal: 8
Number of columns compared with all values equal: 26
Total number of values which compare unequal: 1,935

Columns with Unequal Values or Types
------------------------------------

               Column df1 dtype df2 dtype  # Unequal  Max Diff  # Null Diff
1              gender    object    object          1         0            1
5       health_worker    object    object        170         0          170
4          hiv_status    object    object        158         0          158
2        hospitalised    object    object         17         0            0
3                 icu    object    object        719         0          719
7  immunosuppresssion    object    object        140         0          139
0            pregnant    object    object          4         0            4
6        transmission    object    object        726         0          726

Sample Rows with Unequal Values
-------------------------------

                            pregnant (df1) pregnant (df2)
recordid  reporting_country
QC410474  CANADA                       NaN            UNK
QC402718  CANADA                       NaN            UNK
BCmpx_148 CANADA                       NaN            UNK
QC396690  CANADA                       NaN            UNK

                            gender (df1) gender (df2)
recordid  reporting_country
QC409318  CANADA                     NaN          UNK

                            hospitalised (df1) hospitalised (df2)
recordid  reporting_country
QC392452  CANADA                          YUNK                UNK
QC393406  CANADA                          YUNK                UNK
BCmpx_148 CANADA                          YUNK                UNK
QC402718  CANADA                          YUNK                UNK
QC392459  CANADA                          YUNK                UNK
QC395855  CANADA                          YUNK                UNK
QC394059  CANADA                          YUNK                UNK
QC397438  CANADA                          YUNK                UNK
QC392456  CANADA                          YUNK                UNK
QC400493  CANADA                          YUNK                UNK

                            icu (df1) icu (df2)
recordid  reporting_country
ON1267110 CANADA                  NaN       UNK
ON1267446 CANADA                  NaN       UNK
ON1264113 CANADA                  NaN       UNK
ON1266189 CANADA                  NaN       UNK
ON1274355 CANADA                  NaN       UNK
ON1276710 CANADA                  NaN       UNK
ON1271336 CANADA                  NaN       UNK
ON1267459 CANADA                  NaN       UNK
ON1264736 CANADA                  NaN       UNK
ON1271572 CANADA                  NaN       UNK

                            hiv_status (df1) hiv_status (df2)
recordid  reporting_country
ON1264147 CANADA                         NaN              UNK
ON1263578 CANADA                         NaN              UNK
ON1267219 CANADA                         NaN              UNK
ON1268222 CANADA                         NaN              UNK
ON1272420 CANADA                         NaN              UNK
ON1264745 CANADA                         NaN              UNK
ON1272516 CANADA                         NaN              UNK
ON1264658 CANADA                         NaN              UNK
BCmpx_176 CANADA                         NaN              UNK
BCmpx_027 CANADA                         NaN              UNK

                            health_worker (df1) health_worker (df2)
recordid  reporting_country
ON1271351 CANADA                            NaN                 UNK
ON1271574 CANADA                            NaN                 UNK
ON1268426 CANADA                            NaN                 UNK
ON1266941 CANADA                            NaN                 UNK
ON1272444 CANADA                            NaN                 UNK
ON1264658 CANADA                            NaN                 UNK
ON1271001 CANADA                            NaN                 UNK
BCmpx_202 CANADA                            NaN                 UNK
ON1272410 CANADA                            NaN                 UNK
ON1269166 CANADA                            NaN                 UNK

                            transmission (df1) transmission (df2)
recordid  reporting_country
ON1265511 CANADA                           NaN                UNK
ON1274847 CANADA                           NaN                UNK
ON1268238 CANADA                           NaN                UNK
ON1266026 CANADA                           NaN                UNK
ON1270303 CANADA                           NaN                UNK
ON1273464 CANADA                           NaN                UNK
ON1269593 CANADA                           NaN                UNK
ON1271529 CANADA                           NaN                UNK
ON1270082 CANADA                           NaN                UNK
ON1272415 CANADA                           NaN                UNK

                            immunosuppresssion (df1) immunosuppresssion (df2)
recordid  reporting_country
ON1267868 CANADA                                 NaN                      UNK
QC392470  CANADA                                 NaN                      UNK
ON1280579 CANADA                                 NaN                      UNK
ON1271351 CANADA                                 NaN                      UNK
QC397438  CANADA                                 NaN                      UNK
ON1268494 CANADA                                 NaN                      UNK
QC400493  CANADA                                 NaN                      UNK
ON1275654 CANADA                                 NaN                      UNK
ON1267606 CANADA                                 NaN                      UNK
ON1274721 CANADA                                 NaN                      UNK

Sample Rows Only in df2 (First 10 Columns)
------------------------------------------

                            pregnant case_class smallpox_vaccine gender sexual_orientation clade hospitalised concurrrent_sti icu outcome
recordid  reporting_country
QC530872  CANADA                 NaN  CONFIRMED              NaN   MALE                NaN   NaN           NO             NaN NaN     NaN
QC529619  CANADA                 NaN  CONFIRMED              NaN   MALE                NaN   NaN           NO             NaN NaN     NaN
QC528769  CANADA                 NaN  CONFIRMED              NaN   MALE                NaN   NaN           NO             NaN NaN     NaN
QC532750  CANADA                 NaN   PROBABLE              NaN   MALE                NaN   NaN         YUNK             NaN NaN     NaN