I have a DataFrame, df1, with a datetime index and a two-level column MultiIndex: level 0, named Capitals, has the values A, B, C, and level 1, named Smalls, has the values a, b, c, d, e.
| Capitals | A | | | | | B | | | | | C | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Smalls | a | b | c | d | e | a | b | c | d | e | a | b | c | d | e |
| Date | | | | | | | | | | | | | | | |
| 01-01-25 | | | | | | | | | | | | | | | |
| 01-02-25 | | | | | | | | | | | | | | | |
| 01-03-25 | | | | | | | | | | | | | | | |
| 01-04-25 | | | | | | | | | | | | | | | |
I have a second dataframe, df2, with the same datetime index and three columns, X, Y and Z.
| | X | Y | Z |
|---|---|---|---|
| Date | | | |
| 01-01-25 | | | |
| 01-02-25 | | | |
| 01-03-25 | | | |
| 01-04-25 | | | |
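To make the setup reproducible, here is a minimal sketch that builds two frames with this shape. The random values, the seed, and the reading of the dates as four consecutive days starting 2025-01-01 are assumptions for illustration only; my real data differs.

```python
import numpy as np
import pandas as pd

# Assumed: four consecutive daily dates (the tables above do not fix the date format).
dates = pd.date_range("2025-01-01", periods=4, freq="D", name="Date")

# Two-level column MultiIndex: Capitals (A, B, C) x Smalls (a..e).
columns = pd.MultiIndex.from_product(
    [["A", "B", "C"], ["a", "b", "c", "d", "e"]],
    names=["Capitals", "Smalls"],
)

# Placeholder values: random numbers stand in for the real data.
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.random((4, 15)), index=dates, columns=columns)
df2 = pd.DataFrame(rng.random((4, 3)), index=dates, columns=["X", "Y", "Z"])
```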
Is there a way to:
i) multiply each B column of df1 by Z of df2 (Ba * Z, Bb * Z, Bc * Z, Bd * Z, Be * Z) and
ii) add the 5 new columns (Smalls: a, b, c, d, e) to df1 under a new Capitals label called D?
| Capitals | A | | | | | B | | | | | C | | | | | D | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Smalls | a | b | c | d | e | a | b | c | d | e | a | b | c | d | e | a | b | c | d | e |
| Date | | | | | | | | | | | | | | | | | | | | |
| 01-01-25 | | | | | | | | | | | | | | | | | | | | |
| 01-02-25 | | | | | | | | | | | | | | | | | | | | |
| 01-03-25 | | | | | | | | | | | | | | | | | | | | |
| 01-04-25 | | | | | | | | | | | | | | | | | | | | |
The method I'm using first creates an empty multi-index DataFrame holding only the new D columns, with the same index and column level names as df1, and concatenates it with the original DataFrame.
It then iterates through the level 1 (Smalls) labels, multiplying each B column by the Z column of df2 and writing the result to the matching D column.

    import pandas as pd

    # Extract the unique level 1 (Smalls) labels from df1.columns
    smalls = df1.columns.get_level_values(1).unique()
    # Create the new MultiIndex columns under the top-level label D
    new_columns = pd.MultiIndex.from_product([['D'], smalls], names=df1.columns.names)
    # Create a zero-filled placeholder DataFrame with the new columns
    empty_df = pd.DataFrame(0, index=df1.index, columns=new_columns)
    # Concatenate with the original DataFrame
    df1 = pd.concat([df1, empty_df], axis=1)
    # Multiply each B column by Z and populate D
    for small in smalls:
        df1[('D', small)] = df1[('B', small)] * df2['Z']

Is there a more streamlined way to do this, using vectors rather than iterating?
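To show the kind of thing I am after, here is a rough sketch of a loop-free version. It assumes `DataFrame.mul` with `axis=0` aligns the Z series on the shared date index, and it uses made-up random data in place of my real frames, so treat it as one possible direction rather than a confirmed solution.

```python
import numpy as np
import pandas as pd

# Placeholder frames with the same layout as in the question (values are random).
dates = pd.date_range("2025-01-01", periods=4, freq="D", name="Date")
columns = pd.MultiIndex.from_product(
    [["A", "B", "C"], ["a", "b", "c", "d", "e"]],
    names=["Capitals", "Smalls"],
)
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.random((4, 15)), index=dates, columns=columns)
df2 = pd.DataFrame(rng.random((4, 3)), index=dates, columns=["X", "Y", "Z"])

# Multiply all Smalls columns under B by Z in one call, row-aligned on the index.
d_block = df1["B"].mul(df2["Z"], axis=0)

# Relabel the result so its columns sit under a new top-level label D.
d_block.columns = pd.MultiIndex.from_product(
    [["D"], d_block.columns], names=df1.columns.names
)

# Append the D block to the original frame.
df1 = pd.concat([df1, d_block], axis=1)
```

This avoids both the zero-filled placeholder frame and the per-column loop, since the D block is computed first and attached once.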