用于抓取数据的VBA脚本不起作用

最大限度

我编写了一个简短的VBA脚本,该脚本可生成URL并下载页面内容并放入新的工作表中。但是,数据始终显示在两页上,从而产生以下类型的URL:

对于结果的第一页:

resultat_annuaire.php?loc=01&item=hopital&session=clear   (with 01 being the region) 

对于第二页:

resultat_annuaire.php?loc=01&item=hopital&page=2   (session=clear is gone, replaced by page=2) 

当我的VBA脚本生成和擦伤网址的第一页,它工作正常(即我得到95个不同的网页下载到我的Excel)

但是,当我运行相同的VBA脚本(仅更改生成第二页的URL的方式)时,它下载的内容是第一个URL第2页内容的95倍。

现在,我尝试通过执行以下操作在Web浏览器中简单地摆弄URL:

输入第二个页面URL:

resultat_annuaire.php?loc=01&item=hopital&page=2

然后像这样将01更改为05:

resultat_annuaire.php?loc=05&item=hopital&page=2

再说一次,什么都没有发生,页面保持不变,即好像我没有将01切换到05。

这是VBA脚本:

Sub Data_scraping()
    For x = 1 To 9
        ActiveWorkbook.Worksheets.Add
        With ActiveSheet.QueryTables.Add(Connection:= _
            "URL;" _
            & "http://etablissements.hopital.fr/resultat_annuaire.php?loc=" _
            & "0" _
            & x _
            & "&item=hopital&session=clear" _
            , Destination:=Range("$A$1"))


        '.CommandType = 0


        .Name = "resultat_annuaire.php?loc=01&item=hopital&session=clear"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlEntirePage
        .WebFormatting = xlWebFormattingNone
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = False
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False
    End With
    ActiveWindow.SmallScroll Down:=18
    Rows("1:31").Select
    Selection.Delete Shift:=xlUp
    Range("A5").Select
Next x
End Sub

有人可以提供解释或帮助吗?

雅库

您的宏非常适合page = 2

Sub sof20287920Data_scrapping()
  Dim x, strLoc, strUrl
  Dim wkb

  Set wkb = Workbooks.Add()
  wkb.Activate

  For x = 1 To 9
    ActiveWorkbook.Worksheets.Add After:=ActiveWorkbook.Sheets(ActiveWorkbook.Sheets.Count)
    strLoc = "resultat_annuaire.php?loc=" & "0" & x & "&item=hopital&session=clear&page=2"
    strUrl = "http://etablissements.hopital.fr/" & strLoc
    With ActiveSheet.QueryTables.Add(Connection:= _
      "URL;" & strUrl _
      , Destination:=Range("$A$1"))


      '.CommandType = 0


      .Name = strLoc
      .FieldNames = True
      .RowNumbers = False
      .FillAdjacentFormulas = False
      .PreserveFormatting = True
      .RefreshOnFileOpen = False
      .BackgroundQuery = True
      .RefreshStyle = xlInsertDeleteCells
      .SavePassword = False
      .SaveData = True
      .AdjustColumnWidth = True
      .RefreshPeriod = 0
      .WebSelectionType = xlEntirePage
      .WebFormatting = xlWebFormattingNone
      .WebPreFormattedTextToColumns = True
      .WebConsecutiveDelimitersAsOne = True
      .WebSingleBlockTextImport = False
      .WebDisableDateRecognition = False
      .WebDisableRedirections = False
      .Refresh BackgroundQuery:=False
    End With

    ActiveWindow.SmallScroll Down:=18
    Rows("1:31").Select
    Selection.Delete Shift:=xlUp
    Range("A5").Select
  Next x

End Sub

即使对于page = 2,session = clear似乎也是必需的,如下所示:

http://etablissements.hopital.fr/resultat_annuaire.php?loc=01&item=hopital&session=clear&page=2

本文收集自互联网,转载请注明来源。

如有侵权,请联系 [email protected] 删除。

编辑于
0

我来说两句

0 条评论
登录 后参与评论

相关文章