(不定期更新)使用VBA解決 excel web 查詢無法匯入、匯入太慢的股市資料

請教snare大神
有聽說goodinfo擋IP變嚴了
我已經間隔一分鐘有了
但我有疑問的點是 被擋後 我用proxy
我網頁可以正常瀏覽

網址為
https://goodinfo.tw/StockInfo/ShowSaleMonChart.asp?STOCK_ID=3005

但HTMLsourcecode.body.innerhtml會出現
您的瀏覽量異常, 已影響網站速度, 目前暫時關閉服務, 請稍後再重新使用
若您是使用程式大量下載本網站資料, 請適當調降程式查詢頻率, 以維護一般使用者的權益

請問您有什麼想法或是知道該如何解決嗎?

謝謝
rainbowsperm wrote:
有聽說goodinfo擋IP變嚴了
我已經間隔一分鐘有了
但我有疑問的點是 被擋後 我用proxy
我網頁可以正常瀏覽

...
...恕刪…


請問您有什麼想法或是知道該如何解決嗎?

謝謝


解決方式很簡單
改用 CreateObject("WinHttp.WinHttpRequest.5.1") ,設定proxy就行
程式碼只有多一行.SetProxy

以前在這2樓就有簡單說明一下
2017-11-12 268樓
2020-08-05 866樓







改用proxy不難,以goodinfo為例,當檢查到出現,瀏覽量異常的關鍵字時,換ip就行
麻煩在穩定、快速、免費的proxy server不好找

如果您測試時,不小心找到穩定、快速、免費的proxy
希望您能整理好分享給各位



'以下使用proxy的簡易範例

'測試方式
'一、用ie、chrome、…等等,開網頁,先手動多次按F5,讓網站觸發擋ip





'二、確定無法查詢後,用WinHttp.WinHttpRequest.5.1
'改用proxy,這時就可發現因為換了ip,網站就正常了







Sub goodinfo_proxy_test()

Dim Xmlhttp As Object, HTMLsourcecode As Object, Url As String, ttt As Double

Set HTMLsourcecode = CreateObject("htmlfile")
Set Xmlhttp = CreateObject("WinHttp.WinHttpRequest.5.1")

ttt = Timer

Url = "https://goodinfo.tw/StockInfo/ShowSaleMonChart.asp?STOCK_ID=3005"

With Xmlhttp
.Open "GET", Url, False
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.setRequestHeader "User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"


'以下幾個proxy是google隨機選來測試的,可用但非常不穩定,時好時壞
'請自行搜尋可用的proxy ip
'(proxy格式,"ip:prot" 或 "網址:prot")

'.SetProxy 2, "139.224.56.17:8080"
'.SetProxy 2, "39.109.123.188:3128"
'.SetProxy 2, "183.181.10.71:3128"
'.SetProxy 2, "161.35.5.35:8080"
'.SetProxy 2, "40.85.152.26:8080"
'.SetProxy 2, "165.225.77.46:9443" '<= 速度快,但被goodinfo檔掉了,可能有很多人在用這個proxy



'當proxy不穩、連不上時,程式碼會中斷,停在.send
'或是傳回proxy錯誤的responsetext,而不是查詢的網站
'所以如果想讓程式碼有自動切換功能
'需另外加上on error做錯誤處理、換ip

'proxy ip 可放在陣列、工作表、文字檔…用for next 廻圈輪流換)


.Send

HTMLsourcecode.body.innerhtml = convertraw(.ResponseBody)

If InStr(HTMLsourcecode.body.innerhtml, "瀏覽量異常") > 0 Then
MsgBox HTMLsourcecode.body.innertext
Exit Sub
End If

'debug
MsgBox HTMLsourcecode.body.innertext, , "proxy test"

End With


Set HTMLsourcecode = Nothing
Set Xmlhttp = Nothing

Debug.Print Timer - ttt

End Sub



Function convertraw(rawdata)

Dim rawstr
Set rawstr = CreateObject("adodb.stream")

With rawstr
.Type = 1
.Mode = 3
.Open
.Write rawdata
.Position = 0
.Type = 2
.Charset = "utf-8"
convertraw = .ReadText
.Close
End With

Set rawstr = Nothing

End Function

謝謝大神的教學, 又學到了一招
不過我proxy是當被擋後用VPN軟體手動切換的
如果之後改用Excel發現有好用的proxy再跟大家分享

謝謝
snare大神
想了解為何使用"msxml2.xmlhttp"的方式
EXCEL會被擋, 但瀏覽器不會?
改用WinHttp的方式EXCEL就不會被擋?
不知是何原因, 之前xmlhttp都用得好好的, 現在才出問題~~XD

不過現在改用WinHttp卻遇到亂碼的問題
有試著研究adodb.stream但還是試不出所以然
所以只好繼續請教大神

網址一樣
https://goodinfo.tw/StockInfo/ShowSaleMonChart.asp?STOCK_ID=3005
但結果會變下面這樣


再麻煩大神了
rainbowsperm wrote:
想了解為何使用"msxml2.xmlhttp"的方式
EXCEL會被擋, 但瀏覽器不會?
改用WinHttp的方式EXCEL就不會被擋?
不知是何原因, 之前xmlhttp都用得好好的, 現在才出問題~~XD


程式沒寫好吧,我試正常





rainbowsperm wrote:
不過現在改用WinHttp卻遇到亂碼的問題
有試著研究adodb.stream但還是試不出所以然



992樓回答您的範例就是解答,或參考其它有用adodb.stream範例

謝謝snare大神 果真如大神所說
不好意思 在一次檢查發現了錯誤
發現改成winhttp後沒有將.responsetext改成.responsebody
一切都正常了~~XD
不好意思浪費您的時間 下次會再更仔細一點
樓主您好,
小弟也發生與972樓的狀況一樣,會出現無法抓到TABLE,而是JSON混雜的資料,依您的說明,改以CreateObject("WinHttp.WinHttpRequest.5.1"),確實可以抓到部分股票資料,但仍會出現找不到TABLE而中斷
另會在"Do Until .readyState = 4: DoEvents: Loop"出現"執行階段錯誤 '438'
先前yahoo股市有公告要改為新版(如下圖),不知是否有影響到程式的運作?還是Windows 10要重灌?





部分程式碼:
DoEvents
Set HTMLsourcecode = CreateObject("htmlfile")
'Set GetXml = CreateObject("msxml2.xmlhttp")
Set GetXml = CreateObject("WinHttp.WinHttpRequest.5.1")
'Set GetXml = CreateObject("Msxml2.XMLHTTP.3.0")
URL = "https://tw.stock.yahoo.com/q/q?s=" & stockNo

With GetXml
.Open "GET", URL, False
.setRequestHeader "Cache-Control", "no-cache"
.setRequestHeader "Pragma", "no-cache"
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
'Do Until .readyState = 4: DoEvents: Loop
HTMLsourcecode.body.innerhtml = .responsetext
Set Table = HTMLsourcecode.all.tags("table")(2).Rows
activer wrote:
小弟也發生與972樓的狀況一樣,會出現無法抓到TABLE,而是JSON混雜的資料,依您的說明,改以CreateObject("WinHttp.WinHttpRequest.5.1"),確實可以抓到部分股票資料,但仍會出現找不到TABLE而中斷
另會在"Do Until .readyState = 4: DoEvents: Loop"出現"執行階段錯誤 '438'
先前yahoo股市有公告要改為新版(如下圖),不知是否有影響到程式的運作?還是Windows 10要重灌?


今天再次測試msxml2.xmlhttp,目前確認到是yahoo改版造成的問題
看來是這幾天是改版過渡期,才出現時好時壞的狀況

(新)https://tw.stock.yahoo.com/q/ts?s=
(舊)https://tw.stock.yahoo.com/q/q?s=

2個網址都需改用WinHttp.WinHttpRequest.5.1

抓不到table時,是innerhtml錯誤,但innertext正常
需增加程式碼除錯、重新下載html

有空我再修正1樓(舊)、21樓(舊),再加寫新版範例,請稍等…
因yahoo網頁改版,無法使用msxml2.xmlhttp正常下載,且出現隨機innerhtml錯誤情形

==================
(2021/8/4更新)
以下各範例程式碼多加一行,指定user-agent
.setRequestHeader "user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"

目前確認,多筆下載,有時會擋ip,有時不會
但因yahoo似乎還在改版中,不確定擋ip會不會常駐
建議可另找資料來源
==================

修正範例如下(測試股票代號,2002中鋼):

一、舊版網址 https://tw.stock.yahoo.com/q/q?s= ……


Sub Yahoo_Old()

Dim i As Integer, j As Integer, re As Integer, URL As String, Html As Object, GetXml As Object, ttt As Double

ActiveSheet.Cells.Clear
re = 0
ttt = Timer
URL = "https://tw.stock.yahoo.com/q/q?s=2002" '舊

retry:
Set Html = CreateObject("htmlfile")
Set GetXml = CreateObject("WinHttp.WinHttpRequest.5.1")


With GetXml
.Open "GET", URL, False
.setRequestHeader "Cache-Control", "no-cache"
.setRequestHeader "Pragma", "no-cache"
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.setRequestHeader "user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"

.send

Html.body.innerhtml = .responsetext
Debug.Print Html.all.tags("table").Length

If Html.all.tags("table").Length = 0 Then
re = re + 1
Debug.Print "retry:" & re
If re > 3 Then
ActiveSheet.Cells(1, 1) = "下載失敗"
Exit Sub
End If

Delaytick (0.5)
Set Html = Nothing
Set GetXml = Nothing
GoTo retry
End If


Set Table = Html.all.tags("table")(2).Rows

For i = 0 To Table.Length - 1
For j = 0 To Table(i).Cells.Length - 1
ActiveSheet.Cells(i + 1, j + 1) = Table(i).Cells(j).innertext
Next j
Next i

ActiveSheet.Columns.AutoFit

End With

Debug.Print Timer - ttt

Set Html = Nothing
Set GetXml = Nothing

End Sub




二、新版網址 https://tw.stock.yahoo.com/q/ts?s= …… (表格)


Sub Yahoo_New()

Dim i As Integer, j As Integer, re As Integer, URL As String, Html As Object, GetXml As Object, ttt As Double

ActiveSheet.Cells.Clear

re = 0
ttt = Timer
URL = "https://tw.stock.yahoo.com/q/ts?s=2002" '新

retry:
Set Html = CreateObject("htmlfile")
Set GetXml = CreateObject("WinHttp.WinHttpRequest.5.1")


With GetXml
.Open "GET", URL, False
.setRequestHeader "Cache-Control", "no-cache"
.setRequestHeader "Pragma", "no-cache"
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.setRequestHeader "user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"

.send

Html.body.innerhtml = .responsetext
Debug.Print Html.all.tags("table").Length

If Html.all.tags("table").Length = 0 Then
re = re + 1
Debug.Print "retry:" & re
If re > 3 Then
ActiveSheet.Cells(1, 1) = "下載失敗"
Exit Sub
End If

Delaytick (0.5)
Set Html = Nothing
Set GetXml = Nothing
GoTo retry
End If

Set Table = Html.all.tags("table")(3).Rows
ActiveSheet.Range("a1") = Split(Html.all.tags("table")(1).innertext, " ")(0)

For i = 0 To Table.Length - 1
For j = 0 To Table(i).Cells.Length - 1
ActiveSheet.Cells(i + 2, j + 1) = Table(i).Cells(j).innertext
Next j
Next i

ActiveSheet.Columns.AutoFit

End With

Debug.Print Timer - ttt

Set Html = Nothing
Set GetXml = Nothing

End Sub


三、新版網址 https://tw.stock.yahoo.com/q/ts?s= …… (剪貼薄)



Sub Yahoo_New_ClipBoard()

Dim i As Integer, j As Integer, re As Integer, URL As String, Html As Object, GetXml As Object, Clipboard As Object, ttt As Double

ActiveSheet.Cells.Clear

re = 0
ttt = Timer
URL = "https://tw.stock.yahoo.com/q/ts?s=2002" '新

retry:
Set Html = CreateObject("htmlfile")
Set GetXml = CreateObject("WinHttp.WinHttpRequest.5.1")
Set Clipboard = CreateObject("new:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")



With GetXml
.Open "GET", URL, False
.setRequestHeader "Cache-Control", "no-cache"
.setRequestHeader "Pragma", "no-cache"
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.setRequestHeader "user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"

.send

Html.body.innerhtml = .responsetext
Debug.Print Html.all.tags("table").Length

If Html.all.tags("table").Length = 0 Then
re = re + 1
Debug.Print "retry:" & re
If re > 3 Then
ActiveSheet.Cells(1, 1) = "下載失敗"
Exit Sub
End If

Delaytick (0.5)
Set Html = Nothing
Set GetXml = Nothing
Set Clipboard = Nothing
GoTo retry
End If

Set Table = Html.getelementbyid("detailTable")
Clipboard.SetText Table.innerhtml
Clipboard.PutInClipboard

With ActiveSheet
.Cells(1, 1).Select
.PasteSpecial NoHTMLFormatting:=True
.Cells(1, 1).Select
.Columns.AutoFit
End With

End With

Debug.Print Timer - ttt

Set Html = Nothing
Set GetXml = Nothing

End Sub



'延遲副程式,三個範例都要加上
Sub Delaytick(setdelay As Single)

Dim StartTime As Double, NowTime As Double
StartTime = Timer * 100
setdelay = setdelay * 100
Do
NowTime = Timer * 100
DoEvents
Loop Until NowTime - StartTime > setdelay

End Sub

2016年時是用舊版excel,效率較差
所以當初所寫的1樓yahoo下載範例,是用DoEvents,用副程式每次下載5筆
現在新版excel沒這個問題,本想趁這次yahoo改版,改成一次一筆,程式碼比較短
但測試後發現,速度是變快了,但反而容易下載失敗
所以這次還是維持分拆資料的做法

==================
(2021/8/4更新)
程式碼可多加一行(也可不加),指定user-agent
.setRequestHeader "user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"

目前確認,多筆下載,有時會擋ip,有時不會
但因yahoo似乎還在改版中,不確定擋ip會不會常駐
建議另找資料來源
==================



如果想改成一次一筆測試,把fake_Multiplex()副程式,用下面程式碼取代
Sub test()

Dim ttt As Double, lastrow As Integer
ttt = Timer
lastrow = Sheets("stock").Range("a1").CurrentRegion.Rows.Count
Sheets("stock").Range("b2:l" & lastrow).Clear: Range("n1:n3") = ""
ReDim TempArray(lastrow - 2, 10)
Call getstock(2, lastrow)
Sheets("stock").Range("b2:l" & lastrow).Value = TempArray()
Erase TempArray()
Debug.Print Timer - ttt

End Sub




新版程式碼修正如下(維持分拆資料的做法)
有加入自動重新下載(1次)功能,如果想增加手動下載部份股票,請參考999樓範例

Global TempArray(), DownloadError As Integer
Sub fake_Multiplex()

Dim i As Integer, j As Integer, lastrow As Integer, Firstdata As Integer, Lastdata As Integer, t As Double, ErrorStock As Integer
t = Timer
DownloadError = 0
lastrow = Sheets("stock").Range("a1").CurrentRegion.Rows.Count
Sheets("stock").Range("b2:l" & lastrow).Clear: Sheets("stock").Range("n1:n3") = ""

ReDim TempArray(lastrow - 2, 10)
If lastrow Mod 5 > 0 Then j = Int(lastrow / 5) + 1 Else j = Int(lastrow / 5)

For i = 1 To j
DoEvents
If i = 1 Then Firstdata = 2 Else Firstdata = (i - 1) * 5 + 1
If i = j Then
Lastdata = lastrow
Else
Lastdata = (i - 1) * 5 + 5
Sheets("stock").Range("n1") = "Loading " & Round((i / j) * 100) & "%"
End If
Call getstock(Firstdata, Lastdata)
Next i

With Sheets("stock")
If DownloadError > 0 Then Call Redownload(lastrow)
.Range("b2:l" & lastrow).Value = TempArray()
If DownloadError > 0 Then .Range("n2") = DownloadError & " 下載失敗"
.Range("n1") = lastrow - 1 - DownloadError & " stock loading ok"
.Cells.EntireColumn.AutoFit
End With


Erase TempArray()
Debug.Print Timer - t


End Sub

Sub Redownload(lastrow As Integer)

Dim i As Integer

For i = 5 To 0 Step -1
Delaytick (1)
Sheets("stock").Range("n3") = DownloadError & "筆失敗=>" & i & "秒後,重新下載"
Next i

DownloadError = 0
Sheets("stock").Range("n3") = ""

For i = 2 To lastrow
If TempArray(i - 2, 1) = "下載失敗" Then
Call getstock(i, i)
End If
Next i

End Sub


Sub getstock(Firstdata As Integer, Lastdata As Integer)

Dim URL As String, Html As Object, GetXml As Object, re As Integer, i As Integer, j As Integer, k As Integer

re = 0

For k = Firstdata To Lastdata

DoEvents
URL = "https://tw.stock.yahoo.com/q/q?s=" & Sheets("stock").Cells(k, 1)

retry:
Set Html = CreateObject("htmlfile")
Set GetXml = CreateObject("WinHttp.WinHttpRequest.5.1")

With GetXml
.Open "GET", URL, False
.setRequestHeader "Cache-Control", "no-cache"
.setRequestHeader "Pragma", "no-cache"
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send

Html.body.innerhtml = .responsetext
Debug.Print Html.all.tags("table").Length

If Html.all.tags("table").Length = 0 Then
re = re + 1
Debug.Print "retry:" & re
If re > 5 Then
DownloadError = DownloadError + 1
Debug.Print "TotalError:" & DownloadError
Debug.Print Sheets("stock").Cells(k, 1) & "下載失敗"
Set Html = Nothing
Set GetXml = Nothing
TempArray(k - 2, 1) = "下載失敗"
Delaytick (0.5)
GoTo Nextstock
End If

Delaytick (0.5)
Set Html = Nothing
Set GetXml = Nothing
GoTo retry
End If


Set Table = Html.all.tags("table")(2).Rows


For i = 1 To Table.Length - 1
For j = 0 To Table(i).Cells.Length - 2

If i = 1 And j = 0 Then
TempArray(k - 2, j) = Mid(Split(Table(i).Cells(j).innertext, Chr(13) & Chr(10))(0), 5, Len(Split(Table(i).Cells(j).innertext, Chr(13) & Chr(10))(0)))
Else
TempArray(k - 2, j) = Trim(Table(i).Cells(j).innertext)
If InStr(TempArray(k - 2, j), "▽") > 0 Or InStr(TempArray(k - 2, j), "▼") > 0 Then Sheets("stock").Cells(i + (k - 1), j + 2).Font.Color = -11489280
If InStr(TempArray(k - 2, j), "△") > 0 Or InStr(TempArray(k - 2, j), "▲") > 0 Then Sheets("stock").Cells(i + (k - 1), j + 2).Font.Color = -16776961
End If

Next j
Next i

End With

Nextstock:
Set Html = Nothing
Set GetXml = Nothing


Next k


End Sub


Sub Delaytick(setdelay As Single)

Dim StartTime As Double, NowTime As Double
StartTime = Timer * 100
setdelay = setdelay * 100
Do
NowTime = Timer * 100
DoEvents
Loop Until NowTime - StartTime > setdelay

End Sub



想測試下載失敗的
一、停用 or 啟用 這一行 If DownloadError > 0 Then Call Redownload(lastrow)
二、 If re > 5 Then …,把5 改成0

(如果a欄有自訂股票代號,請先備份)

因網頁改版,此範例失效,請參考1095樓範例,配合1364樓,nijawang提出的修改方式。

[點擊下載]
關閉廣告
文章分享
評分
評分
複製連結
請輸入您要前往的頁數(1 ~ 143)

今日熱門文章 網友點擊推薦!