core

Monkey patches for pandas.

Utils

 dummydf ()

A dummy DataFrame.

 DataFrame.repetitions (col)

Counts how many times each value occurs in column col.

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 4, 5, 5, 6, 6, 6], 'b': [1, 1, 1, 1, 2, 2, 2, 3, 3, 4]})
df.repetitions('b')

b
1    4
2    3
3    2
4    1
dtype: int64

test(df.repetitions('b'), pd.Series({1: 4, 2: 3, 3: 2, 4: 1}), all_equal)
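
For readers without the patch installed, the same counts can be reproduced in plain pandas. This is a minimal sketch that assumes repetitions behaves like a group-by followed by a size count; the library's actual implementation is not shown here and may differ.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 4, 5, 5, 6, 6, 6],
                   'b': [1, 1, 1, 1, 2, 2, 2, 3, 3, 4]})

# Assumed plain-pandas equivalent of df.repetitions('b'):
# group the rows by the column and count the rows in each group.
reps = df.groupby('b').size()
print(reps)
```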

 DataFrame.repetition_counts (col)

Counts how many distinct values in column col share each repetition count.

In the following example, three values occur once (1, 2, 3), two values occur twice (4, 5), and one value occurs three times (6).

df.repetition_counts('a')

1    3
2    2
3    1
dtype: int64

test(df.repetition_counts('a'), pd.Series({1: 3, 2: 2, 3: 1}), all_equal)
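
The two-step counting can be sketched in plain pandas as follows. It assumes repetition_counts is equivalent to counting occurrences per value and then counting how often each occurrence count appears; that is an inference from the example output, not the library's confirmed implementation.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 4, 5, 5, 6, 6, 6]})

# Step 1: number of occurrences of each value in 'a'.
reps = df.groupby('a').size()

# Step 2: how many values share each occurrence count,
# ordered by the occurrence count itself.
rep_counts = reps.value_counts().sort_index()
print(rep_counts)
```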

 DataFrame.single_events (col)

Returns the rows whose value in column col appears only once.

df.single_events('a')

test_eq(df.single_events('a'), df.loc[[0, 1, 2]])
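
A plain-pandas sketch of the same filter, assuming single_events keeps the rows whose value in the column is not duplicated anywhere else (which matches the test above):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 4, 5, 5, 6, 6, 6],
                   'b': [1, 1, 1, 1, 2, 2, 2, 3, 3, 4]})

# duplicated(keep=False) marks every row whose value occurs more than once,
# so negating the mask keeps only the values that occur exactly once.
singles = df[~df['a'].duplicated(keep=False)]
print(singles)
```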

Pandas functions that are easier to execute as DataFrame/Series methods.

 DataFrame.crosstab (index, column, **kwargs)

 DataFrame.len ()

 Series.len ()
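
Neither method is documented beyond its signature. As a rough sketch, assuming index and column are column names and that the method forwards to pd.crosstab, the plain-pandas equivalents would look like this (the helper function crosstab below is hypothetical, written only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 4, 5, 5, 6, 6, 6],
                   'b': [1, 1, 1, 1, 2, 2, 2, 3, 3, 4]})

def crosstab(frame, index, column, **kwargs):
    # Hypothetical helper: assumes index and column name columns of frame.
    return pd.crosstab(frame[index], frame[column], **kwargs)

table = crosstab(df, 'a', 'b')  # frequency table of 'a' against 'b'
n_rows = len(df)                # what a len() method would presumably return
print(table)
print(n_rows)
```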

These methods allow fast exploration of the data in one line.

 Index.l ()

 Series.minmax ()

 DataFrame.page (page, page_size=5)

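Shows one page of rows. Judging by the example below, pages are numbered from 1, so page p covers positions (p-1)*page_size up to, but not including, p*page_size.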

df = pd.DataFrame({'a': range(12), 'b': range(12)})
df.page(3)

     a   b
10  10  10
11  11  11

 Series.page (page, page_size=5)

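Shows one page of elements. Judging by the example below, pages are numbered from 1, so page p covers positions (p-1)*page_size up to, but not including, p*page_size.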

s = pd.Series(range(15))
s.page(2)

5    5
6    6
7    7
8    8
9    9
dtype: int64
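
Both outputs above are consistent with 1-based page numbers. A minimal sketch of that behaviour with positional slicing follows; the function name page_of is hypothetical and the 1-based convention is inferred from the examples, not taken from the library's source.

```python
import pandas as pd

def page_of(obj, page, page_size=5):
    # Hypothetical helper: 1-based paging over a DataFrame or Series.
    start = (page - 1) * page_size
    return obj.iloc[start:start + page_size]

df = pd.DataFrame({'a': range(12), 'b': range(12)})
s = pd.Series(range(15))

df_page = page_of(df, 3)  # last, partial page of the 12-row frame
s_page = page_of(s, 2)    # second page of the 15-element series
print(df_page)
print(s_page)
```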

 L.page (page, page_size=10)

Shows elements between page*page_size and (page+1)*page_size

These methods are slight variations on the built-in DataFrame ones.

 DataFrame.renamec (d, *args, **kwargs)

df = dummydf()
df.renamec({'col_1': 'col_a'}, 'col_2', 'bar')

 Series.notin (values)
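
The name suggests the negation of Series.isin. A sketch under that assumption, using only plain pandas:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# Assumed equivalent of s.notin([2, 4]): True where the element
# is NOT one of the given values.
mask = ~s.isin([2, 4])
print(s[mask])
```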

 Series.mapk (fun, **kwargs)

 DataFrame.sort (by, **kwargs)

temp = df.sample(df.len())
test_eq(temp.sort('col_1'), df)
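
The test above shuffles the frame and sorts it back. The same round trip in plain pandas, assuming sort is a thin wrapper around sort_values (the frame below is a stand-in, since the contents of dummydf are not shown):

```python
import pandas as pd

# Stand-in data; the real dummydf() output may differ.
df = pd.DataFrame({'col_1': [1, 2, 3, 4], 'col_2': list('abcd')})

shuffled = df.sample(frac=1, random_state=0)  # shuffle all rows
restored = shuffled.sort_values('col_1')      # index labels travel with rows
print(restored)
```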

 DataFrame.c2back (cols2back)

 DataFrame.c2front (cols2front)

df = dummydf()

df.c2back(['col_1'])

df.c2back('col_1')

df.c2front('col_2')

df.c2front(['col_2'])

 DataFrame.reorderc (to_front=[], to_back=[])

Reorder DataFrame columns.

df['col_3'] = df['col_1']
df.reorderc(['col_3'], ['col_1'])
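
Column reordering of this kind can be sketched in plain pandas as below. The helper reorder_columns is hypothetical; it assumes that columns named in neither list keep their original relative order, which the source does not confirm.

```python
import pandas as pd

def reorder_columns(frame, to_front=(), to_back=()):
    # Hypothetical helper: listed columns go to the front or back,
    # all remaining columns keep their original relative order.
    middle = [c for c in frame.columns
              if c not in to_front and c not in to_back]
    return frame[list(to_front) + middle + list(to_back)]

# Stand-in data; the real dummydf() output may differ.
df = pd.DataFrame({'col_1': [1, 2], 'col_2': ['a', 'b']})
df['col_3'] = df['col_1']

reordered = reorder_columns(df, ['col_3'], ['col_1'])
print(reordered.columns.tolist())
```

With only one of the two lists supplied, the same helper covers the c2front and c2back cases.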