Datura
Member
Forum Replies Created
-
(3) Analysis of Variance (ANOVA) Test
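The t-test comes in several types: one-sample, two-sample, paired, and so on. Their differences are a topic for another time. A t-test can only compare 2 groups of (continuous) data; with more groups or more factors it falls short. This is where the powerful ANOVA takes the stage.
Take the chemical reaction example above: the experiment is run at 6 different reaction temperatures, giving 6 groups of reaction yields, one per temperature. How do we judge whether these groups differ? Just follow the same approach.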
H0: μ1 = μ2 = μ3 = … = μ6
H1: μ1, μ2, …, μ6 are not all equal.
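Load the experimental data into Excel or SAS and run the ANOVA. Then:
1. If p > alpha, we cannot reject H0. That is, the groups may not differ significantly; the observed differences could simply be due to random error.
2. If p < alpha, we can reject H0. That is, the 6 groups differ significantly: at least one group is different from the others.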
ANOVA can be used for almost any comparison of group data, and also in regression analysis; it is very powerful. Also note that ANOVA comes in both one-way and two-way forms.
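As a rough illustration of the test described above, here is a minimal pure-Python sketch of the one-way ANOVA F statistic. The yield numbers are made up for illustration (three temperature groups rather than six), and `one_way_anova_f` is a hypothetical helper name, not a library function.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical yields (%) at three reaction temperatures.
yields = [
    [71.0, 72.0, 73.0],
    [72.0, 73.0, 74.0],
    [73.0, 74.0, 75.0],
]
f_stat, df_b, df_w = one_way_anova_f(yields)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The p-value is the upper tail of the F distribution at this statistic. If SciPy is installed, `scipy.stats.f_oneway(*yields)` returns both the statistic and the p-value, which is then compared with alpha.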
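In digital marketing, running a Student's t-test or a chi-squared test is called A/B testing. In fact, it is just the hypothesis testing introduced here. Don't fail to recognize it just because it is wearing a different outfit!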
-
The X axis is the same, but the Y axis is different.
In a cumulative gains chart, we want to show how many more targets the predictive model captures than random selection would. With the cumulative distributions of the Good and Bad classes, however, we want to see the separation power of the model.
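To make the difference concrete, here is a small sketch that builds both sets of curves from the same ranking. The scores and labels are invented, `cumulative_curves` is a hypothetical helper, and label 1 is assumed to mark the target class. The largest gap between the two class curves is what is commonly reported as the KS statistic.

```python
def cumulative_curves(scores, labels):
    """Rank by descending score; return the shared X axis and both class curves."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    n_target = sum(label for _, label in ranked)
    n_other = n - n_target
    population, cum_target, cum_other = [], [], []
    hit = miss = 0
    for i, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hit += 1
        else:
            miss += 1
        population.append(i / n)            # X axis: share of population reviewed
        cum_target.append(hit / n_target)   # share of targets captured so far
        cum_other.append(miss / n_other)    # share of the other class seen so far
    return population, cum_target, cum_other

# Invented example: 8 accounts, 3 of them targets (label 1).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
population, cum_target, cum_other = cumulative_curves(scores, labels)

# Gains chart: Y is the target capture rate, read against the diagonal Y = X.
gain_over_random = [t - x for t, x in zip(cum_target, population)]

# Class distribution chart: Y is each class's own cumulative share.
ks = max(abs(t - o) for t, o in zip(cum_target, cum_other))
print(f"best gain over random: {max(gain_over_random):.2f}, KS: {ks:.2f}")
```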
Did I answer your questions?
-
Generally, it is a work process. We need to understand it in context.
-
A credit score of zero or below is theoretically possible, because we can choose the PDO (points to double the odds) and the offset to scale a predicted probability to any score range we want.
In practice, though, it is more convenient to set the score range between 100 and 1000. In that case, zero or negative scores are not real scores; they are just special value codes meaning "no score" or something else. We need to exclude these rows from our analysis.
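For reference, here is a sketch of the scaling commonly used for scorecards, score = offset + factor × ln(odds) with factor = PDO / ln 2. It assumes this is the scaling meant here; the parameter values (PDO 20, base score 600 at 50:1 good:bad odds) and the name `scale_score` are illustrative only.

```python
import math

def scale_score(p_bad, pdo=20.0, base_score=600.0, base_odds=50.0):
    """Map a predicted probability of bad to a score (assumed scaling)."""
    factor = pdo / math.log(2)                        # points per unit of ln(odds)
    offset = base_score - factor * math.log(base_odds)
    odds = (1.0 - p_bad) / p_bad                      # good:bad odds
    return offset + factor * math.log(odds)

print(round(scale_score(1 / 51)))             # odds of 50:1 land on the base score
print(round(scale_score(1 / 101)))            # doubling the odds adds one PDO
print(scale_score(0.5, base_score=100.0))     # a low base score can push scores below zero
```

With a low enough offset the formula happily returns negative numbers, which is why the range is usually pinned by choosing the base score and PDO up front.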
To create the deciles, we divide the records into approximately equal-sized bins based on the score. In Python we can use the pandas qcut() function; in SAS, use PROC RANK.
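A minimal pandas sketch of that decile step, assuming a column named `score` and treating zero or negative values as special codes. The data is made up.

```python
import pandas as pd

# Hypothetical data: 100 real scores plus two rows carrying special value codes.
df = pd.DataFrame({"score": list(range(300, 900, 6)) + [0, -1]})

# Drop the special codes (zero or negative) before binning.
valid = df[df["score"] > 0].copy()

# qcut cuts on quantiles, so each bin gets roughly the same number of rows.
valid["decile"] = pd.qcut(valid["score"], q=10, labels=False) + 1

print(valid.groupby("decile")["score"].agg(["count", "min", "max"]))
```

Here decile 1 holds the lowest scores. If the scores contain many ties, `qcut` may need `duplicates="drop"` and the bins will be less even.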
-
Wow, it works, awesome! You are a genius. Thank you! We need to use the locals()/globals() symbol tables, which are similar to the macro symbol tables in SAS.
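import pandas as pd

# `raw` is assumed to be an existing list of six DataFrames (not defined in this snippet).
df1 = pd.DataFrame({})
df2 = pd.DataFrame({})
df3 = pd.DataFrame({})
df4 = pd.DataFrame({})
df5 = pd.DataFrame({})
df6 = pd.DataFrame({})
dflist = [df1, df2, df3, df4, df5, df6]
for i in range(len(dflist)):
    print("loop round: ", i)
    # At module level, locals() is the same dict as globals(), so this rebinds df1..df6.
    locals()['df' + str(i + 1)] = raw[i]
df6
- This reply was modified 4 years, 2 months ago by Datura.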
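One caveat on the locals() trick: it works at module level, where locals() and globals() are the same dictionary, but assigning through locals() inside a function is not reliable. A common alternative is to keep the frames in a dictionary keyed by name. A minimal sketch, with a made-up stand-in for `raw`:

```python
import pandas as pd

# Hypothetical stand-in for `raw`: a list of six small DataFrames.
raw = [pd.DataFrame({"value": [i]}) for i in range(6)]

# Keep the frames in a dict keyed by name instead of creating variables dynamically.
frames = {f"df{i + 1}": frame for i, frame in enumerate(raw)}

print(frames["df6"])
```

The lookup `frames["df6"]` then plays the role of the variable `df6`, and it behaves the same inside and outside functions.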
-
I tried this method. It ran without errors, but the data frames df1 to df3 are all still empty. The loop does not overwrite the predefined empty ones.
df1 = pd.DataFrame()
df2 = pd.DataFrame()
df3 = pd.DataFrame()
dflist = [df1, df2, df3]
for i in range(len(dflist)):
    print("loop round: ", i)
    temp = raw[i]
    print(temp)
    dflist[i] = temp
df1
So, what's wrong?
- This reply was modified 4 years, 2 months ago by Datura.
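A likely explanation, sketched with a made-up stand-in for `raw`: `dflist[i] = temp` replaces the i-th slot of the list, while the names df1 to df3 keep pointing at the original empty frames.

```python
import pandas as pd

# Hypothetical stand-in for `raw`: a list of three small DataFrames.
raw = [pd.DataFrame({"value": [i]}) for i in range(3)]

df1 = pd.DataFrame()
dflist = [df1]
dflist[0] = raw[0]          # rebinds the list slot only
untouched = df1.empty       # the name df1 still refers to the empty frame
print(untouched)            # True
print(dflist[0].empty)      # False

# Tuple unpacking binds the names themselves.
df1, df2, df3 = raw
print(df1.empty)            # False
```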
-
I checked the urlwatch Python package (see below). The fundamental idea is the same as mine, so we can just use this package, since it is available and free.
———————————————-
Introduction
urlwatch monitors the output of webpages or arbitrary shell commands.
Every time you run urlwatch, it:
- retrieves the output and processes it
- compares it with the version retrieved the previous time (“diffing”)
- if it finds any differences, generates a summary “report” that can be displayed or sent via one or more methods, such as email
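The compare-and-report idea in the steps above can be sketched with the standard library's difflib. This is not urlwatch's own code: retrieval is replaced by two hard-coded strings, and `diff_report` is a hypothetical helper name.

```python
import difflib

def diff_report(previous, current, name="page"):
    """Return a unified-diff report, or an empty string when nothing changed."""
    lines = difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile=f"{name} (previous)",
        tofile=f"{name} (current)",
        lineterm="",
    )
    return "\n".join(lines)

# Stand-ins for the output saved last time and the output retrieved now.
old = "Price: 10\nIn stock: yes"
new = "Price: 12\nIn stock: yes"

report = diff_report(old, new)
if report:
    print(report)  # in a real job this is what would be emailed
```

In a real job, `old` would be read from wherever the previous run stored its result, and `new` would be saved in its place after the comparison.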