2

I have a dataframe with a column multi-index, df1, with a datetime index and 2 levels: level 0, called Capitals, has columns A, B, C, and level 1, called Smalls, has columns a, b, c, d, e.

Capitals A B C
Smalls a b c d e a b c d e a b c d e
Date
01-01-25
01-02-25
01-03-25
01-04-25

I have a second dataframe, df2, with the same datetime index and three columns, X, Y and Z.

X Y Z
Date
01-01-25
01-02-25
01-03-25
01-04-25

Is there a way to:

i) multiply B of df1 by Z of df2 (Ba * Z, Bb * Z, Bc * Z, Bd * Z Be * Z) and

ii) add the 5 new (Smalls: a, b, c, d ,e) columns to a new Capitals column called D in df1?

Capitals A B C D
Smalls a b c d e a b c d e a b c d e a b c d e
Date
01-01-25
01-02-25
01-03-25
01-04-25

The method I'm using first creates an empty multi-index data frame with a similar structure to df1, with the new columns I want to add, which is connected to the original multi-index data frame.

Then it iterates through the level 1 values of B, multiplying by the value of Z in the second df.

# Extract level 1 tickers from df1.columns
smalls = df1.columns.get_level_values(1).unique()

# Create new MultiIndex for the empty columns
new_columns = pd.MultiIndex.from_product(['D', smalls],names=df1.columns.names)

# Create an empty DataFrame with the new columns
empty_df = pd.DataFrame(0, index=df1.index, columns=new_columns)

# Concatenate with the original DataFrame
df1 = pd.concat([df1, empty_df], axis=1)

# Multiply dfs and populate D
for small in smalls:
    df1[('D', small)] = df1[('B', small)] / df2['Z']

Is there a more streamlined way to do this, using vectors rather than iterating?

2 Answers 2

2

You can create the new D columns by multiplying the B columns in df1 by the Z column in df2 then concatenating them by concat. The column name can be renamed using MultiIndex.from_product.

B_columns = df1.loc[:, 'B']
D_columns = B_columns.mul(df2['Z'], axis=0)

# Rename the top level from 'B' to 'D'
smalls = df1.columns.get_level_values(1).unique()
D_columns.columns = pd.MultiIndex.from_product([['D'], smalls], names=df1.columns.names)

df = pd.concat([df1, D_columns], axis=1)

Using two testing dataframes, df1:

Capitals    A                   B                   C
Smalls      a   b   c   d   e   a   b   c   d   e   a   b   c   d   e
Date                                                            
01-01-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5
01-02-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5
01-03-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5
01-04-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5

and df2:

             X   Y   Z
Date            
01-01-25    10  11  12
01-02-25    10  11  12
01-03-25    10  11  12
01-04-25    10  11  12

This code returns:

Capitals    A                   B                   C                    D
Smalls      a   b   c   d   e   a   b   c   d   e   a   b   c   d   e    a   b   c   d   e
Date                                                                                
01-01-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5   12  24  36  48  60
01-02-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5   12  24  36  48  60
01-03-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5   12  24  36  48  60
01-04-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5   12  24  36  48  60
Sign up to request clarification or add additional context in comments.

Comments

1

You can use concat for add D level to B and another concat or DataFrame.join for append to original:

np.random.seed(2)

mux = pd.MultiIndex.from_product([list('ABC'), list('abcde')])
df1 = pd.DataFrame(np.random.randint(10, size=(4, 15)), 
                   index=pd.date_range('01-01-25','01-04-25'), columns=mux)
print (df1)
            A              B              C            
            a  b  c  d  e  a  b  c  d  e  a  b  c  d  e
2025-01-01  8  8  6  2  8  7  2  1  5  4  4  5  7  3  6
2025-01-02  4  3  7  6  1  3  5  8  4  6  3  9  2  0  4
2025-01-03  2  4  1  7  8  2  9  8  7  1  6  8  5  9  9
2025-01-04  9  3  0  0  2  8  8  2  9  6  5  6  6  6  3

df2 = pd.DataFrame(np.random.randint(10, size=(4, 3)),
                   index=pd.date_range('01-01-25','01-04-25'),
                   columns=list('XYZ'))
print (df2)
            X  Y  Z
2025-01-01  8  2  1
2025-01-02  4  8  1
2025-01-03  6  9  5
2025-01-04  1  2  4

df3 = pd.concat({'D': df1.loc[:, 'B'].mul(df2['X'], axis=0)}, axis=1)
df = pd.concat([df1, df3], axis=1)

Or:

df3 = pd.concat({'D': df1.loc[:, 'B'].mul(df2['X'], axis=0)}, axis=1)


df = df1.join(df3)
print (df)
            A              B              C               D                
            a  b  c  d  e  a  b  c  d  e  a  b  c  d  e   a   b   c   d   e
2025-01-01  8  8  6  2  8  7  2  1  5  4  4  5  7  3  6   7   2   1   5   4
2025-01-02  4  3  7  6  1  3  5  8  4  6  3  9  2  0  4   3   5   8   4   6
2025-01-03  2  4  1  7  8  2  9  8  7  1  6  8  5  9  9  10  45  40  35   5
2025-01-04  9  3  0  0  2  8  8  2  9  6  5  6  6  6  3  32  32   8  36  24

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.