A Side-by-Side Summary of Common Econometrics Commands in Python and Stata

2023/1/3 17:45:31

Source: compiled from "Stata to Python Equivalents"

http://www.danielmsullivan.com/pages/tutorial_stata_to_python.html

1. Data input and output

Description

Stata

Python

Session log

log using <file>

Python doesn't display results automatically like Stata. You have to explicitly call the print function. Using a Jupyter notebook is the closest equivalent.

Help files

help <command>

help(<command>) OR <command>? in IPython (as in pd.read_stata?)

Set or change the working directory

cd some/other/directory

import os
os.chdir('some/other/directory')
This works, but it is bad practice. Better practice is to use full pathnames whenever possible.

Load a Stata data file

use my_file

import pandas as pd
df = pd.read_stata('my_file.dta')

Load selected variables only

use var1 var2 using my_file

df = pd.read_stata('my_file.dta', columns=['var1', 'var2'])

Import an Excel spreadsheet

import excel using <excelfile>

df = pd.read_excel('<excelfile>')

Import a CSV file

import delimited using my_file.csv

df = pd.read_csv('my_file.csv')

Save data

save my_file, replace

df.to_stata('my_file.dta') OR df.to_pickle('my_file.pkl') for a Python-native file type.

Export to CSV

outsheet using my_file.csv, comma

df.to_csv('my_file.csv')

Export to Excel

export excel using <excelfile>

df.to_excel('<excelfile>')
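
The read and write calls in this table can be tried end to end with a small round trip. The following is a minimal sketch, not part of the original tutorial: the toy DataFrame, the temporary directory, and the `write_index=False` / `index=False` arguments are assumptions added so that the pandas row index is not written out as an extra variable.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical toy data standing in for my_file.
df = pd.DataFrame({
    "var1": [1, 2, 3],
    "var2": [0.5, 1.5, 2.5],
    "var3": [10, 20, 30],
})

with tempfile.TemporaryDirectory() as tmp:
    dta_path = Path(tmp) / "my_file.dta"
    csv_path = Path(tmp) / "my_file.csv"

    # Stata: save my_file, replace
    # write_index=False keeps the row index out of the .dta file.
    df.to_stata(dta_path, write_index=False)

    # Stata: use var1 var2 using my_file
    subset = pd.read_stata(dta_path, columns=["var1", "var2"])

    # Stata: outsheet using my_file.csv, comma
    df.to_csv(csv_path, index=False)

    # Stata: import delimited using my_file.csv
    from_csv = pd.read_csv(csv_path)

print(subset)
print(from_csv)
```

Without `index=False`, `to_csv` writes the index as an unnamed first column, which then shows up as an extra variable when the file is read back.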

2. Data management

Description

Stata

Python

Keep observations that satisfy a condition

keep if <condition>

df = df[<condition>]

Keep observations where a is greater than 7

keep if a > 7

df = df[df['a'] > 7]

Drop observations that satisfy a condition

drop if <condition>

df = df[~(<condition>)] where ~ is the logical negation operator in pandas and numpy (and bitwise negation for Python more generally).


keep if _n == 1

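df = df.head(1) OR df = df.iloc[[0], :] (the inner list keeps the result a DataFrame; df.iloc[0, :] returns the row as a Series). Python is a 0-indexed language, so when counting the elements of lists and arrays, you start with 0 instead of 1.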


keep if _n == _N

df = df.tail(1) OR df = df.iloc[[-1], :]


keep if _n == 7

df = df.iloc[[6], :] (Remember to count from 0; the inner list keeps the result a DataFrame)


keep if _n <= 10

df = df.iloc[:10, :] (rows 0 through 9; the end of a slice is excluded)

Keep a single variable

keep var

df = df[['var']] (double brackets keep a DataFrame; df['var'] returns a Series)

Keep variables var1 and var2

keep var1 var2

df = df[['var1', 'var2']]

Keep variables whose names start with varstem

keep varstem*

df = df.filter(regex='^varstem') (like='varstem' would also match names that merely contain varstem)

Drop variable var

drop var

del df['var'] OR df = df.drop('var', axis=1)

Drop variables var1 and var2

drop var1 var2

df = df.drop(['var1', 'var2'], axis=1)

Drop variables whose names start with varstem

drop varstem*

df = df.drop(df.filter(regex='^varstem').columns, axis=1) (the * wildcard is Stata syntax; pandas would treat it as a literal character in like=)
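
The row and column selections in this table are easy to get subtly wrong (off-by-one slices, Series instead of DataFrame, wildcard matching), so a small runnable check may help. This is a sketch on made-up data: the column names `my_varstem` and `other` are hypothetical and are there only to show that an anchored regex matches prefixes and nothing else.

```python
import pandas as pd

# Hypothetical toy data; 12 rows so that `_n <= 10` actually drops something.
df = pd.DataFrame({
    "a": range(1, 13),
    "varstem_x": range(12),
    "varstem_y": range(12),
    "my_varstem": range(12),
    "other": range(12),
})

kept = df[df["a"] > 7]          # Stata: keep if a > 7
dropped = df[~(df["a"] > 7)]    # Stata: drop if a > 7
first_ten = df.iloc[:10, :]     # Stata: keep if _n <= 10 (slice end is excluded)
first_row = df.iloc[[0], :]     # Stata: keep if _n == 1 (list keeps a DataFrame)
last_row = df.iloc[[-1], :]     # Stata: keep if _n == _N

# Stata: keep varstem*  (the ^ anchors the match to the start of the name)
stem_cols = df.filter(regex="^varstem")
# Stata: drop varstem*
no_stem = df.drop(columns=stem_cols.columns)

print(list(stem_cols.columns))
print(list(no_stem.columns))
```

Here `my_varstem` survives the drop because the pattern is anchored; `filter(like='varstem')` would have matched it as well.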

3. Descriptive statistics

Description

Stata

Python

Describe the dataset

describe

df.info() OR df.dtypes just to get data types. Note that Python does not have value labels like Stata does.

Describe a single variable

describe var

df['var'].dtype

Count observations

count

df.shape[0] OR len(df). Here df.shape returns a tuple with the length and width of the DataFrame.


count if <condition>

df[<condition>].shape[0] OR (<condition>).sum() if the condition involves a DataFrame, e.g., (df['age'] > 2).sum()

Summarize variable var

summ var

df['var'].describe()


summ var if <condition>

df[<condition>]['var'].describe() OR df.loc[<condition>, 'var'].describe()


summ var [aw = <weight>]

Right now you have to calculate weighted summary stats manually. There are also some tools available in the Statsmodels package.


summ var, d

df['var'].describe() plus df['var'].quantile([.1, .25, .5, .75, .9]) or whatever other statistics you want.

Tabulate var (one-way frequency table)

tab var

df['var'].value_counts()
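
Since weighted summary statistics have to be computed by hand, here is a minimal sketch of the counting and summary rows above together with a manual weighted mean. The toy data and the weight column name `w` are assumptions made for illustration. Only the weighted mean is shown, because conventions for weighted variances differ between tools.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; "w" plays the role of an analytic weight.
df = pd.DataFrame({
    "var": [1.0, 2.0, 3.0, 4.0],
    "age": [1, 3, 5, 2],
    "w": [1.0, 1.0, 2.0, 4.0],
})

n_obs = len(df)                          # Stata: count
n_older = int((df["age"] > 2).sum())     # Stata: count if age > 2
summary = df["var"].describe()           # Stata: summ var
detail = df["var"].quantile([.1, .25, .5, .75, .9])  # extra percentiles, as in summ var, d
freq = df["age"].value_counts()          # Stata: tab age

# Stata: summ var [aw = w], mean only: sum(w * x) / sum(w)
weighted_mean = np.average(df["var"], weights=df["w"])

print(n_obs, n_older, weighted_mean)
```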

4. Panel data

Description

Stata

Python

Declare the panel structure

tsset panelvar timevar

df = df.set_index(['panelvar', 'timevar'])

One-period lag

L.var

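df['var'].shift() NOTE: The index must be correctly sorted for shift to work the way you want it to. With panel data you will also probably need a groupby on the panel variable so that lags do not cross from one unit into the next, e.g. df.groupby(level='panelvar')['var'].shift() once the index has been set as above.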

Two-period lag

L2.var

df['var'].shift(2)


F.var

df['var'].shift(-1)
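
The sorting and groupby caveats for `shift` can be seen on a tiny panel. The following is a sketch with made-up data, not the original author's code; the new column names `L_var`, `L2_var`, and `F_var` are hypothetical. Note one difference from Stata that this sketch does not address: `shift` moves values by row position, so it assumes each unit has no gaps in `timevar`, whereas Stata's `L.` operator respects the time variable.

```python
import pandas as pd

# Hypothetical unsorted panel with two units, A and B.
df = pd.DataFrame({
    "panelvar": ["B", "A", "A", "B", "A"],
    "timevar": [1, 2, 1, 2, 3],
    "var": [10.0, 2.0, 1.0, 20.0, 3.0],
})

# Stata: tsset panelvar timevar
# Sorting the index puts each unit's rows in time order.
df = df.set_index(["panelvar", "timevar"]).sort_index()

# Group by unit so that lags and leads never cross from A into B.
by_unit = df.groupby(level="panelvar")["var"]
df["L_var"] = by_unit.shift()      # Stata: L.var
df["L2_var"] = by_unit.shift(2)    # Stata: L2.var
df["F_var"] = by_unit.shift(-1)    # Stata: F.var

print(df)
```

Without the groupby, the first row of unit B would receive the last value of unit A as its lag.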

5. Econometric model commands

Stata

Python

ttest var1, by(var2)

from scipy.stats import ttest_ind
ttest_ind(array1, array2)

xi: i.var

pd.get_dummies(df['var'])

i.var2#c.var1

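pd.get_dummies(df['var2']).multiply(df['var1'], axis=0) (axis=0 aligns var1 with the rows; the default would try to align it with the column labels)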

reg yvar xvar if <condition>, r

import econtools.metrics as mt
results = mt.reg(df[<condition>], 'yvar', 'xvar', robust=True)

reg yvar xvar if <condition>, vce(cluster cluster_var)

results = mt.reg(df[<condition>], 'yvar', 'xvar', cluster='cluster_var')

areg yvar xvar1 xvar2, absorb(fe_var)

results = mt.reg(df, 'yvar', ['xvar1', 'xvar2'], fe_name='fe_var')

predict newvar, resid

newvar = results.resid

predict newvar, xb

newvar = results.yhat

_b[var], _se[var]

results.beta['var'], results.se['var']

test var1 var2

results.Ftest(['var1', 'var2'])

test var1 var2, equal

results.Ftest(['var1', 'var2'], equal=True)

lincom var1 + var2

econtools.metrics.f_test with appropriate parameters.

ivreg2

econtools.metrics.ivreg

outreg2

econtools.outreg

reghdfe

None (hoping to add it to Econtools soon).
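
The first three rows of this table (t-test, dummies, interactions) need only scipy and pandas, so they can be sketched without econtools. The toy data below is an assumption for illustration. Two differences from Stata are worth keeping in mind: `pd.get_dummies` returns one column per category with no omitted base level, and `ttest_ind` takes the two groups as separate arrays rather than a grouping variable. Like Stata's `ttest`, it assumes equal variances by default.

```python
import pandas as pd
from scipy.stats import ttest_ind

# Hypothetical toy data: var2 defines two groups of three observations each.
df = pd.DataFrame({
    "var1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "var2": ["a", "a", "a", "b", "b", "b"],
})

# Stata: ttest var1, by(var2)
group_a = df.loc[df["var2"] == "a", "var1"]
group_b = df.loc[df["var2"] == "b", "var1"]
t_stat, p_value = ttest_ind(group_a, group_b)

# Stata: xi: i.var2  (dtype=float gives 0.0/1.0 columns instead of booleans)
dummies = pd.get_dummies(df["var2"], dtype=float)

# Stata: i.var2#c.var1  (axis=0 multiplies row by row)
interactions = dummies.multiply(df["var1"], axis=0)

print(t_stat, p_value)
print(interactions)
```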

 

Reposted from the WeChat official account “经管学苑”
