python - Pandas DataFrame slicing based on logical conditions?
I have a dataframe called data:
       subjects professor  studentid
8     chemistry      jane        999
1     chemistry      jane       3455
0     chemistry    joseph       1234
2       history      jane       3455
6       history     smith        323
7       history     smith        999
3   mathematics       doe      56767
10  mathematics  einstein       3455
5       physics  einstein       2834
9       physics     smith        323
4       physics     smith        999
I want to run the query "professors with at least 2 classes with 2 or more of the same students". The desired output is:
smith: physics, history, 323, 999
I am familiar with SQL and could have done this easily, but I am still a beginner in Python. How do I achieve this output in Python? One line of thought is to convert the dataframe into a SQL database and use an SQL interface through Python to run queries. Is there a way to accomplish that?
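On the SQL idea raised above: one possible route, sketched here under assumptions, is to load the dataframe into an in-memory SQLite database with `DataFrame.to_sql` and read the query result back with `pandas.read_sql_query`. The table name `data`, the column aliases, and the self-join query are illustrative choices, not something given in the question.

```python
import sqlite3

import pandas as pd

# Rebuild the example dataframe from the question (original index omitted).
data = pd.DataFrame({
    "subjects": ["chemistry", "chemistry", "chemistry", "history", "history",
                 "history", "mathematics", "mathematics", "physics",
                 "physics", "physics"],
    "professor": ["jane", "jane", "joseph", "jane", "smith", "smith",
                  "doe", "einstein", "einstein", "smith", "smith"],
    "studentid": [999, 3455, 1234, 3455, 323, 999, 56767, 3455, 2834,
                  323, 999],
})

# Load the dataframe into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
data.to_sql("data", conn, index=False)

# Self-join: pair up two different subjects taught by the same professor
# to the same student, then keep pairs sharing at least 2 students.
query = """
SELECT a.professor,
       a.subjects AS subject_a,
       b.subjects AS subject_b,
       COUNT(DISTINCT a.studentid) AS shared_students
FROM data AS a
JOIN data AS b
  ON a.professor = b.professor
 AND a.studentid = b.studentid
 AND a.subjects < b.subjects
GROUP BY a.professor, a.subjects, b.subjects
HAVING COUNT(DISTINCT a.studentid) >= 2
"""
result = pd.read_sql_query(query, conn)
conn.close()

print(result)
```

On the sample data this should leave only the smith row for the history/physics pair. The `a.subjects < b.subjects` condition is there so each pair of subjects is counted once rather than twice.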
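# df is the dataframe called data in the question
# Step 1: for each (professor, subject), does it have at least 2 distinct students?
# Step 2: per professor, count such subjects and keep professors with at least 2
students_and_subjects = (
    df.groupby(['professor', 'subjects'])
      .studentid.nunique().ge(2)
      .groupby(level='professor').sum().ge(2)
)
# Keep only the rows of the qualifying professors
df[df.professor.map(students_and_subjects)]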
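The snippet above can be tried end to end on the sample data, as in the sketch below. One caveat: as written, it checks that a professor has at least 2 subjects with at least 2 distinct students each, but not that those students are the same across subjects. The second half of the sketch is a stricter variant using a self-merge; it is an illustrative alternative, not part of the original answer, and the names `pairs`, `shared`, and `strict` are made up for this example.

```python
import pandas as pd

# Rebuild the example dataframe from the question (original index omitted).
df = pd.DataFrame({
    "subjects": ["chemistry", "chemistry", "chemistry", "history", "history",
                 "history", "mathematics", "mathematics", "physics",
                 "physics", "physics"],
    "professor": ["jane", "jane", "joseph", "jane", "smith", "smith",
                  "doe", "einstein", "einstein", "smith", "smith"],
    "studentid": [999, 3455, 1234, 3455, 323, 999, 56767, 3455, 2834,
                  323, 999],
})

# Approach from the answer: professors with >= 2 subjects that each
# have >= 2 distinct students.
students_and_subjects = (
    df.groupby(["professor", "subjects"])
      .studentid.nunique().ge(2)
      .groupby(level="professor").sum().ge(2)
)
answer_rows = df[df.professor.map(students_and_subjects)]
print(answer_rows)

# Stricter variant (assumption: "same students" means shared across subjects).
# Self-merge on professor and student, keep each subject pair once,
# then count the distinct students shared by each pair.
pairs = df.merge(df, on=["professor", "studentid"], suffixes=("_a", "_b"))
pairs = pairs[pairs.subjects_a < pairs.subjects_b]
shared = pairs.groupby(["professor", "subjects_a", "subjects_b"]).studentid.nunique()
strict = shared[shared >= 2]
print(strict)
```

On this data both versions single out smith, but they can disagree in general, for example when a professor teaches two classes with two students each and no student in common.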