地方厚生局のコード内容別医療機関一覧表から医療機関の診療科を取得する

 先の記事では全国の医療機関の医療機関コードを取得した.今回は全国の医療機関の診療科を取得し,第一正規形にするところまでを解説する.

 今回から生成AIをコーディングの補助として使用することをお断りしておく.

データの取得

 データの取得は地方厚生局から始める.各地方厚生局のサイトの検査欄に「コード内容別医療機関一覧表」と入力して出てきた一番目に該当ページがある.ファイルの置き場所,構造は8つの地方厚生局それぞれによって異なる.下図は九州厚生局のサイトである.エクセルデータをダウンロードする.

九州厚生局のコード内容別医療機関一覧表
九州厚生局のコード内容別医療機関一覧表

Power Queryで地方厚生局ごとにデータを集約する

 九州の8県のファイルを厚生局ごと,医科と歯科ごとにフォルダに分ける.まとめてPower Queryで処理したくなるが,エラーにより全国一括読み込みは不可能である.地方厚生局ごとに読み込むのがせいぜいである.特に東北厚生局のファイルはブック内に複数のシートが県別に分かれているため,一枚のワークシートにコピペする必要がある.

「データの取得」「ファイルから」「フォルダーから」と進む
「データの取得」「ファイルから」「フォルダーから」と進む
読み込みたいファイルの直上のフォルダーを指定する
読み込みたいファイルの直上のフォルダーを指定する
「データの変換」をクリックする
「データの変換」をクリックする
Content列以外は削除する
Content列以外は削除する
Content列の右肩のボタンを押下して展開する
Content列の右肩のボタンを押下して展開する
「ファイルの結合」でSheet1を選択してOK
「ファイルの結合」でSheet1を選択してOK
ここまで自動的に読み込まれる
ここまで自動的に読み込まれる
46469行読み込まれた
46469行読み込まれた

VBAでの処理

 Power Queryで処理したかったが試行錯誤の末,諦めた.この時点で各地方厚生局ごと,医科ごと歯科ごとに一つのブックが出来上がっていることと思う.医科をまとめる用,歯科をまとめる用の新規ブックでVBEを起動して標準モジュールを挿入し,下記コードを実行する.I列にある診療科名を取得するのに苦労した.

Option Explicit

Sub CollectAllDataRecursive()
    ' ========================================
    ' 変数宣言
    ' ========================================
    Dim myFSO       As Object
    Dim myFolder    As Object
    Dim myWsDst     As Worksheet
    Dim myResult()  As Variant
    Dim j           As Long
    Dim myPath      As String
    Dim myMaxRows   As Long
    Dim myStartTime As Double
    
    ' ========================================
    ' 初期化
    ' ========================================
    myStartTime = Timer
    
    Set myFSO = CreateObject("Scripting.FileSystemObject")
    
    ' 最上位フォルダを選択
    With Application.FileDialog(msoFileDialogFolderPicker)
        .Title = "データファイルが格納されている最上位フォルダを選択"
        If .Show = -1 Then
            myPath = .SelectedItems(1)
        Else
            Exit Sub
        End If
    End With
    
    Set myFolder = myFSO.GetFolder(myPath)
    Set myWsDst = ThisWorkbook.Worksheets("Sheet1")
    myWsDst.Cells.Clear
    
    ' 結果格納用配列の初期サイズ
    myMaxRows = 500000
    ReDim myResult(1 To myMaxRows, 1 To 10)
    j = 0
    
    ' ========================================
    ' メイン処理(再帰的探索)
    ' ========================================
    Application.ScreenUpdating = False
    Application.DisplayAlerts = False
    Application.Calculation = xlCalculationManual
    
    Debug.Print "データ収集開始: " & Now
    
    Call ProcessFolderRecursive(myFSO, myFolder, myResult, j, myMaxRows)
    
    Debug.Print "ファイル処理完了: " & j & "行"
    
    ' ========================================
    ' 結果の一括書き込み
    ' ========================================
    If j > 0 Then
        Debug.Print "シートへの書き込み開始..."
        myWsDst.Range("A1").Resize(j, 10).Value = myResult
        Debug.Print "書き込み完了"
    End If
    
    Application.Calculation = xlCalculationAutomatic
    Application.ScreenUpdating = True
    Application.DisplayAlerts = True
    
    MsgBox "データ収集が完了しました。" & vbCrLf & _
           "総行数: " & Format(j, "#,##0") & "行" & vbCrLf & _
           "処理時間: " & Format(Timer - myStartTime, "0.0") & "秒", vbInformation
End Sub


Private Sub ProcessFolderRecursive(ByVal myFSO As Object, _
                                    ByVal myFolder As Object, _
                                    ByRef myResult() As Variant, _
                                    ByRef j As Long, _
                                    ByRef myMaxRows As Long)
    Dim myFile      As Object
    Dim mySubFolder As Object
    Dim myWB        As Workbook
    Dim mySh        As Worksheet
    Dim myRng       As Range
    Dim myVar       As Variant
    Dim i           As Long
    Dim k           As Long
    
    ' ========================================
    ' 現在のフォルダ内のファイルを処理
    ' ========================================
    For Each myFile In myFolder.Files
        With myFile
            If Right$(.Name, 5) = ".xlsx" Then
                
                Debug.Print "処理中 (" & j & "行): " & .Name
                
                On Error Resume Next
                Set myWB = Workbooks.Open(.Path, ReadOnly:=True, UpdateLinks:=False)
                On Error GoTo 0
                
                If Not myWB Is Nothing Then
                    On Error Resume Next
                    Set mySh = myWB.Worksheets("Sheet1")
                    On Error GoTo 0
                    
                    If Not mySh Is Nothing Then
                        ' A1からのCurrentRegionを取得
                        Set myRng = mySh.Range("A1").CurrentRegion
                        
                        ' A列~J列に制限
                        If myRng.Columns.Count >= 10 Then
                            Set myRng = Intersect(myRng, mySh.Range("A:J"))
                        End If
                        
                        ' データが存在するかチェック
                        If myRng.Rows.Count > 0 Then
                            myVar = myRng.Value
                            
                            ' 1次元配列の場合(1行のみ)の対応
                            If IsArray(myVar) Then
                                If UBound(myVar, 1) = 1 And UBound(myVar, 2) >= 1 Then
                                    ' 1行のみの場合
                                    j = j + 1
                                    
                                    If j > myMaxRows Then
                                        myMaxRows = myMaxRows + 100000
                                        ReDim Preserve myResult(1 To myMaxRows, 1 To 10)
                                    End If
                                    
                                    For k = 1 To UBound(myVar, 2)
                                        If k <= 10 Then
                                            myResult(j, k) = myVar(1, k)
                                        End If
                                    Next k
                                Else
                                    ' 複数行の場合
                                    For i = LBound(myVar, 1) To UBound(myVar, 1)
                                        j = j + 1
                                        
                                        ' 配列サイズを超えたら拡張
                                        If j > myMaxRows Then
                                            myMaxRows = myMaxRows + 100000
                                            ReDim Preserve myResult(1 To myMaxRows, 1 To 10)
                                        End If
                                        
                                        ' A列~J列のデータをコピー
                                        For k = LBound(myVar, 2) To UBound(myVar, 2)
                                            If k <= 10 Then
                                                myResult(j, k) = myVar(i, k)
                                            End If
                                        Next k
                                    Next i
                                End If
                            End If
                        End If
                    End If
                    
                    myWB.Close SaveChanges:=False
                    Set myWB = Nothing
                End If
            End If
        End With
    Next myFile
    
    ' ========================================
    ' サブフォルダを再帰的に処理
    ' ========================================
    For Each mySubFolder In myFolder.SubFolders
        Call ProcessFolderRecursive(myFSO, mySubFolder, myResult, j, myMaxRows)
    Next mySubFolder
End Sub

 医科作業用ファイル,歯科作業用ファイルとしてそれぞれ保存する.次に新規ブックを作成し,両者のデータを統合する.下記コードを標準モジュールに記述し実行する.

Sub MergeExistingFiles_Safe()
    ' ========================================
    ' 変数宣言
    ' ========================================
    Dim myWB1       As Workbook
    Dim myWB2       As Workbook
    Dim mySh1       As Worksheet
    Dim mySh2       As Worksheet
    Dim myWsDst     As Worksheet
    Dim myRng1      As Range
    Dim myRng2      As Range
    Dim myPath1     As Variant
    Dim myPath2     As Variant
    Dim myStartTime As Double
    Dim myLastRow1  As Long
    Dim myLastRow2  As Long
    
    ' ========================================
    ' 初期化
    ' ========================================
    myStartTime = Timer
    
    myPath1 = Application.GetOpenFilename("Excel Files, *.xls?", , "医科データファイルを選択")
    If myPath1 = False Then Exit Sub
    
    myPath2 = Application.GetOpenFilename("Excel Files, *.xls?", , "歯科データファイルを選択")
    If myPath2 = False Then Exit Sub
    
    Set myWsDst = ThisWorkbook.Worksheets("Sheet1")
    myWsDst.Cells.Clear
    
    Application.ScreenUpdating = False
    Application.DisplayAlerts = False
    Application.Calculation = xlCalculationManual
    
    ' ========================================
    ' 1つ目のファイルを処理(医科)
    ' ========================================
    Debug.Print "医科データ読み込み中..."
    
    Set myWB1 = Workbooks.Open(myPath1, ReadOnly:=True, UpdateLinks:=False)
    Set mySh1 = myWB1.Worksheets("Sheet1")
    
    ' 最終行を取得
    myLastRow1 = mySh1.Cells(mySh1.Rows.Count, "A").End(xlUp).Row
    
    If myLastRow1 > 0 Then
        ' A列~J列の全データを取得
        Set myRng1 = mySh1.Range("A1:J" & myLastRow1)
        
        Debug.Print "医科データ: " & Format(myLastRow1, "#,##0") & "行"
        Debug.Print "範囲: " & myRng1.Address
        
        ' 直接コピー
        myRng1.Copy Destination:=myWsDst.Range("A1")
    End If
    
    myWB1.Close SaveChanges:=False
    Set myWB1 = Nothing
    
    ' ========================================
    ' 2つ目のファイルを処理(歯科)
    ' ========================================
    Debug.Print "歯科データ読み込み中..."
    
    Set myWB2 = Workbooks.Open(myPath2, ReadOnly:=True, UpdateLinks:=False)
    Set mySh2 = myWB2.Worksheets("Sheet1")
    
    ' 最終行を取得
    myLastRow2 = mySh2.Cells(mySh2.Rows.Count, "A").End(xlUp).Row
    
    If myLastRow2 > 0 Then
        ' A列~J列の全データを取得
        Set myRng2 = mySh2.Range("A1:J" & myLastRow2)
        
        Debug.Print "歯科データ: " & Format(myLastRow2, "#,##0") & "行"
        Debug.Print "範囲: " & myRng2.Address
        
        ' 医科データの下に追加
        myRng2.Copy Destination:=myWsDst.Range("A" & (myLastRow1 + 1))
    End If
    
    myWB2.Close SaveChanges:=False
    Set myWB2 = Nothing
    
    Application.Calculation = xlCalculationAutomatic
    Application.ScreenUpdating = True
    Application.DisplayAlerts = True
    
    Debug.Print "完了"
    
    MsgBox "データ統合が完了しました。" & vbCrLf & _
           "医科: " & Format(myLastRow1, "#,##0") & "行" & vbCrLf & _
           "歯科: " & Format(myLastRow2, "#,##0") & "行" & vbCrLf & _
           "合計: " & Format(myLastRow1 + myLastRow2, "#,##0") & "行" & vbCrLf & _
           "処理時間: " & Format(Timer - myStartTime, "0.0") & "秒", vbInformation
End Sub

 この時点でまだ第一正規形にはなっておらず,単に全データを一枚のワークシートにコピペしただけである.ここから医療機関番号,医療機関名,診療科名のリスト,郵便番号,住所を取り出す.A列の連番が前の医療機関と次の医療機関の境界となっている.I列に診療科名のリストがあるが,取得すべきでない文字列もI列にあり,それらを弾く仕組みの構築に苦労した.結果的にI列の最終行にあるセルが診療科名のリストであるという構造を適用した結果取得に成功した.

Sub ExtractDepartments_Redesign()
    ' ========================================
    ' 変数宣言
    ' ========================================
    Dim myWsSrc     As Worksheet
    Dim myWsDst     As Worksheet
    Dim myLastRow   As Long
    Dim i           As Long
    Dim myStartRow  As Long
    Dim myNextStart As Long
    Dim myOutRow    As Long
    Dim myCode      As String
    Dim myName      As String
    Dim myRawAddr   As String
    Dim myZip7      As String
    Dim myAddr      As String
    Dim myDept      As String
    
    ' ========================================
    ' ワークシートオブジェクトの取得
    ' ========================================
    Set myWsSrc = ThisWorkbook.Sheets("Sheet1")
    
    On Error Resume Next
    Set myWsDst = ThisWorkbook.Sheets("Sheet2")
    On Error GoTo 0
    
    If myWsDst Is Nothing Then
        Set myWsDst = Worksheets.Add(After:=myWsSrc)
        myWsDst.Name = "Sheet2"
    Else
        myWsDst.Cells.Clear
    End If
    
    ' ========================================
    ' ヘッダー行の設定
    ' ========================================
    With myWsDst
        .Cells(1, 1).Value = "医療機関番号"
        .Cells(1, 2).Value = "医療機関名"
        .Cells(1, 3).Value = "診療科名"
        .Cells(1, 4).Value = "M_ZIPCODE"
        .Cells(1, 5).Value = "住所"
    End With
    
    ' ========================================
    ' メイン処理
    ' ========================================
    myLastRow = myWsSrc.Cells(myWsSrc.Rows.Count, "A").End(xlUp).Row
    myOutRow = 2
    i = 1
    
    Do While i <= myLastRow
        If IsMedicalStartRow(myWsSrc, i) Then
            myStartRow = i
            myNextStart = FindNextStart(myWsSrc, i + 1, myLastRow)
            
            If myNextStart = 0 Then
                myNextStart = myLastRow + 1
            End If
            
            ' データの取得
            With myWsSrc
                myCode = Trim(.Cells(myStartRow, 2).Value & "")
                myName = Trim(.Cells(myStartRow, 3).Value & "")
                myRawAddr = Trim(.Cells(myStartRow, 4).Value & "")
            End With
            
            Call ExtractZipAndAddress(myRawAddr, myZip7, myAddr)
            myDept = FindDeptFromBottom(myWsSrc, myStartRow, myNextStart - 1)
            
            ' データの出力
            With myWsDst
                .Cells(myOutRow, 1).Value = myCode
                .Cells(myOutRow, 2).Value = myName
                .Cells(myOutRow, 3).Value = myDept
                .Cells(myOutRow, 4).NumberFormat = "@"
                .Cells(myOutRow, 4).Value = myZip7
                .Cells(myOutRow, 5).Value = myAddr
            End With
            
            myOutRow = myOutRow + 1
            i = myNextStart
        Else
            i = i + 1
        End If
    Loop
    
    MsgBox "診療科+郵便番号+住所の抽出が完了しました(Sheet2)"
End Sub


Private Function FindNextStart(ByVal myWs As Worksheet, _
                                ByVal myStartRow As Long, _
                                ByVal myLastRow As Long) As Long
    Dim i As Long
    
    For i = myStartRow To myLastRow
        If IsMedicalStartRow(myWs, i) Then
            FindNextStart = i
            Exit Function
        End If
    Next i
    
    FindNextStart = 0
End Function


Private Function FindDeptFromBottom(ByVal myWs As Worksheet, _
                                     ByVal myStartRow As Long, _
                                     ByVal myEndRow As Long) As String
    Dim i       As Long
    Dim myCand  As String
    
    For i = myEndRow To myStartRow Step -1
        myCand = Trim(myWs.Cells(i, 9).Value & "")
        
        If myCand <> "" Then
            If Not IsNoiseDept2(myCand) Then
                FindDeptFromBottom = myCand
                Exit Function
            End If
        End If
    Next i
    
    FindDeptFromBottom = ""
End Function


Private Function IsMedicalStartRow(ByVal myWs As Worksheet, _
                                    ByVal myRow As Long) As Boolean
    Dim myA As String
    Dim myC As String
    
    myA = myWs.Cells(myRow, 1).Value & ""
    myC = Trim(myWs.Cells(myRow, 3).Value & "")
    
    If myC = "" Then
        IsMedicalStartRow = False
        Exit Function
    End If
    
    myA = Replace(myA, " ", " ")
    myA = Trim(myA)
    myA = ZenkakuToHankaku(myA)
    
    If Left$(myA, 1) = "(" Then
        IsMedicalStartRow = False
        Exit Function
    End If
    
    If myA <> "" And IsNumeric(myA) Then
        IsMedicalStartRow = True
    Else
        IsMedicalStartRow = False
    End If
End Function


Private Function IsNoiseDept2(ByVal myTxt As String) As Boolean
    Dim myT     As String
    Dim myRest  As String
    Dim myInner As String
    Dim myKeys  As Variant
    Dim myKey   As Variant
    
    myT = Trim(Replace(myTxt, " ", " "))
    
    If IsNumeric(myT) Then
        IsNoiseDept2 = True
        Exit Function
    End If
    
    If Left$(myT, 1) = "(" And Right$(myT, 1) = ")" Then
        myInner = Mid$(myT, 2, Len(myT) - 2)
        myInner = ZenkakuToHankaku(myInner)
        
        If IsNumeric(myInner) Then
            IsNoiseDept2 = True
            Exit Function
        End If
    End If
    
    If myT = "一般" Or myT = "療養" Then
        IsNoiseDept2 = True
        Exit Function
    End If
    
    If Left$(myT, 2) = "一般" And InStr(myT, "(") > 0 Then
        IsNoiseDept2 = True
        Exit Function
    End If
    
    myKeys = Array("一般", "精神", "感染", "介護", "療養")
    For Each myKey In myKeys
        If Left$(myT, Len(myKey)) = myKey Then
            myRest = Trim(Mid$(myT, Len(myKey) + 1))
            myRest = ZenkakuToHankaku(myRest)
            
            If myRest <> "" And IsNumeric(myRest) Then
                IsNoiseDept2 = True
                Exit Function
            End If
        End If
    Next myKey
    
    If Left$(myT, 2) = "その他" Then
        IsNoiseDept2 = True
        Exit Function
    End If
    
    IsNoiseDept2 = False
End Function


Private Function ZenkakuToHankaku(ByVal myTxt As String) As String
    Dim i           As Long
    Dim myCh        As String
    Dim myCode      As Long
    Dim myResult    As String
    
    myResult = ""
    
    For i = 1 To Len(myTxt)
        myCh = Mid$(myTxt, i, 1)
        myCode = AscW(myCh)
        
        If myCode >= &HFF10 And myCode <= &HFF19 Then
            myResult = myResult & ChrW(myCode - &HFF10 + AscW("0"))
        Else
            myResult = myResult & myCh
        End If
    Next i
    
    ZenkakuToHankaku = myResult
End Function


Private Sub ExtractZipAndAddress(ByVal myRaw As String, _
                                  ByRef myZip7 As String, _
                                  ByRef myAddr As String)
    Dim myP         As Long
    Dim i           As Long
    Dim myCh        As String
    Dim myZipPart   As String
    
    myZip7 = ""
    myAddr = myRaw
    
    myP = InStr(myRaw, "〒")
    If myP = 0 Then Exit Sub
    
    myZipPart = ""
    For i = myP + 1 To Len(myRaw)
        myCh = Mid$(myRaw, i, 1)
        
        If (myCh >= "0" And myCh <= "9") _
            Or (AscW(myCh) >= &HFF10 And AscW(myCh) <= &HFF19) _
            Or myCh = "-" _
            Or myCh = "‐" _
            Or myCh = "-" _
            Or myCh = "―" Then
            myZipPart = myZipPart & myCh
        Else
            Exit For
        End If
    Next i
    
    If myZipPart = "" Then Exit Sub
    
    myZipPart = ZenkakuToHankaku(myZipPart)
    myZipPart = Replace(myZipPart, "‐", "-")
    myZipPart = Replace(myZipPart, "-", "-")
    myZipPart = Replace(myZipPart, "―", "-")
    
    myZip7 = Replace(myZipPart, "-", "")
    
    myAddr = ""
    If i <= Len(myRaw) Then
        myAddr = Trim(Mid$(myRaw, i))
    End If
End Sub

 その結果第一正規形の手前の状態となる.全角スペースが区切り文字となった配列型とみなせなくもない.その他にも丸括弧の中が配列型で区切り文字が全角カンマとなっているもの,単に全角カンマが区切り文字となっているものなどがある.全角中黒は区切り文字ではなく,おそらく一人の医師が複数可を標榜しているものと思われるため,そのままとしておく.

 以上の処理を順次実行する.取り出すデータは医療機関番号,医療機関名,診療科名,郵便番号,住所である.

' ========================================
' ドライバー: すべての処理を順番に実行
' ========================================
Sub ExecuteAllNormalization()
    Dim myStartTime As Double
    Dim myMsg       As String
    
    myStartTime = Timer
    
    ' 実行確認
    myMsg = "以下の処理を順番に実行します:" & vbCrLf & vbCrLf & _
            "① 全角スペースで分割(normalized シート)" & vbCrLf & _
            "② 括弧内の展開(Sheet2)" & vbCrLf & _
            "③ 読点区切りの展開(Sheet3)" & vbCrLf & _
            "④ 丸付き数字での展開(Sheet4)" & vbCrLf & _
            "⑤ カンマ区切りの展開(Sheet5)" & vbCrLf & vbCrLf & _
            "実行しますか?"
    
    If MsgBox(myMsg, vbYesNo + vbQuestion, "診療科正規化処理") = vbNo Then
        Exit Sub
    End If
    
    Application.ScreenUpdating = False
    Application.DisplayAlerts = False
    Application.Calculation = xlCalculationManual
    
    Debug.Print "========================================="
    Debug.Print "診療科正規化処理開始: " & Now
    Debug.Print "========================================="
    
    ' ① 全角スペースで分割
    Debug.Print "① 全角スペースで分割処理中..."
    Call NormalizeDepartments
    Debug.Print "   完了"
    
    ' ② 括弧内の展開
    Debug.Print "② 括弧内の展開処理中..."
    Call ExpandDepartments
    Debug.Print "   完了"
    
    ' ③ 読点区切りの展開
    Debug.Print "③ 読点区切りの展開処理中..."
    Call SplitByComma
    Debug.Print "   完了"
    
    ' ④ 丸付き数字での展開
    Debug.Print "④ 丸付き数字での展開処理中..."
    Call SplitByCircledNumbers
    Debug.Print "   完了"
    
    ' ⑤ カンマ区切りの展開
    Debug.Print "⑤ カンマ区切りの展開処理中..."
    Call SplitByCommaOnly
    Debug.Print "   完了"
    
    Application.Calculation = xlCalculationAutomatic
    Application.ScreenUpdating = True
    Application.DisplayAlerts = True
    
    Debug.Print "========================================="
    Debug.Print "すべての処理が完了しました"
    Debug.Print "処理時間: " & Format(Timer - myStartTime, "0.0") & "秒"
    Debug.Print "========================================="
    
    MsgBox "すべての診療科正規化処理が完了しました。" & vbCrLf & vbCrLf & _
           "最終結果: Sheet6" & vbCrLf & _
           "処理時間: " & Format(Timer - myStartTime, "0.0") & "秒", _
           vbInformation, "処理完了"
End Sub


' ========================================
' ① NormalizeDepartments
' ========================================
Sub NormalizeDepartments()
    ' ========================================
    ' 変数宣言
    ' ========================================
    Dim myWsSrc     As Worksheet
    Dim myWsDst     As Worksheet
    Dim myLastRow   As Long
    Dim myOutRow    As Long
    Dim i           As Long
    Dim myRaw       As String
    Dim myParts     As Variant
    Dim myP         As Variant
    Dim myNorm      As String
    
    ' ========================================
    ' ワークシートオブジェクトの取得
    ' ========================================
    Set myWsSrc = ThisWorkbook.Sheets("Sheet2")
    
    On Error Resume Next
    Set myWsDst = ThisWorkbook.Sheets("normalized")
    On Error GoTo 0
    
    If myWsDst Is Nothing Then
        Set myWsDst = Worksheets.Add(After:=myWsSrc)
        myWsDst.Name = "normalized"
    Else
        myWsDst.Cells.Clear
    End If
    
    ' ========================================
    ' ヘッダー行の設定
    ' ========================================
    With myWsDst
        .Cells(1, 1).Value = "医療機関番号"
        .Cells(1, 2).Value = "医療機関名"
        .Cells(1, 3).Value = "診療科"
        .Cells(1, 4).Value = "郵便番号"
        .Cells(1, 5).Value = "住所"
    End With
    
    ' ========================================
    ' メイン処理
    ' ========================================
    myLastRow = myWsSrc.Cells(myWsSrc.Rows.Count, "A").End(xlUp).Row
    myOutRow = 2
    
    For i = 2 To myLastRow
        myRaw = Trim(myWsSrc.Cells(i, 3).Value & "")
        
        If myRaw <> "" Then
            myParts = Split(myRaw, " ")
            
            For Each myP In myParts
                myP = Trim(myP & "")
                
                If myP <> "" Then
                    myNorm = NormalizeDeptName(myP)
                    
                    With myWsDst
                        .Cells(myOutRow, 1).Value = myWsSrc.Cells(i, 1).Value
                        .Cells(myOutRow, 2).Value = myWsSrc.Cells(i, 2).Value
                        .Cells(myOutRow, 3).Value = myNorm
                        .Cells(myOutRow, 4).Value = myWsSrc.Cells(i, 4).Value
                        .Cells(myOutRow, 5).Value = myWsSrc.Cells(i, 5).Value
                    End With
                    
                    myOutRow = myOutRow + 1
                End If
            Next myP
        End If
    Next i
End Sub


Private Function NormalizeDeptName(ByVal myS As String) As String
    NormalizeDeptName = Trim(myS)
End Function


' ========================================
' ② ExpandDepartments
' ========================================
Sub ExpandDepartments()
    ' ========================================
    ' 変数宣言
    ' ========================================
    Dim myWsSrc     As Worksheet
    Dim myWsDst     As Worksheet
    Dim myLastRow   As Long
    Dim myOutRow    As Long
    Dim i           As Long
    Dim myRaw       As String
    Dim myExpanded  As Variant
    Dim myE         As Variant
    
    ' ========================================
    ' ワークシートオブジェクトの取得
    ' ========================================
    Set myWsSrc = ThisWorkbook.Sheets("normalized")
    
    On Error Resume Next
    Set myWsDst = ThisWorkbook.Sheets("Sheet3")
    On Error GoTo 0
    
    If myWsDst Is Nothing Then
        Set myWsDst = Worksheets.Add(After:=myWsSrc)
        myWsDst.Name = "Sheet3"
    Else
        myWsDst.Cells.Clear
    End If
    
    ' ========================================
    ' ヘッダー行の設定
    ' ========================================
    With myWsDst
        .Cells(1, 1).Value = "医療機関番号"
        .Cells(1, 2).Value = "医療機関名"
        .Cells(1, 3).Value = "診療科"
        .Cells(1, 4).Value = "郵便番号"
        .Cells(1, 5).Value = "住所"
    End With
    
    ' ========================================
    ' メイン処理
    ' ========================================
    myLastRow = myWsSrc.Cells(myWsSrc.Rows.Count, "A").End(xlUp).Row
    myOutRow = 2
    
    For i = 2 To myLastRow
        myRaw = Trim(myWsSrc.Cells(i, 3).Value & "")
        
        If myRaw <> "" Then
            myExpanded = ExpandDept(myRaw)
            
            For Each myE In myExpanded
                With myWsDst
                    .Cells(myOutRow, 1).Value = myWsSrc.Cells(i, 1).Value
                    .Cells(myOutRow, 2).Value = myWsSrc.Cells(i, 2).Value
                    .Cells(myOutRow, 3).Value = myE
                    .Cells(myOutRow, 4).Value = myWsSrc.Cells(i, 4).Value
                    .Cells(myOutRow, 5).Value = myWsSrc.Cells(i, 5).Value
                End With
                
                myOutRow = myOutRow + 1
            Next myE
        End If
    Next i
End Sub


Private Function ExpandDept(ByVal myS As String) As Variant
    Dim myResult()  As String
    Dim myInner     As String
    Dim myOuter     As String
    Dim myP1        As Long
    Dim myP2        As Long
    Dim myParts     As Variant
    Dim i           As Long
    Dim myHasComma  As Boolean
    
    myP1 = InStr(myS, "(")
    myP2 = InStr(myS, ")")
    
    If myP1 = 0 Or myP2 = 0 Or myP2 < myP1 Then
        myP1 = InStr(myS, "(")
        myP2 = InStr(myS, ")")
    End If
    
    If myP1 = 0 Or myP2 = 0 Or myP2 < myP1 Then
        ReDim myResult(0)
        myResult(0) = myS
        ExpandDept = myResult
        Exit Function
    End If
    
    myInner = Mid$(myS, myP1 + 1, myP2 - myP1 - 1)
    myOuter = Trim(Mid$(myS, myP2 + 1))
    
    If myOuter = "" Then
        ReDim myResult(0)
        myResult(0) = myS
        ExpandDept = myResult
        Exit Function
    End If
    
    myHasComma = (InStr(myInner, "、") > 0) Or (InStr(myInner, ",") > 0)
    
    If myHasComma Then
        If InStr(myInner, "、") > 0 Then
            myParts = Split(myInner, "、")
        Else
            myParts = Split(myInner, ",")
        End If
        
        ReDim myResult(UBound(myParts))
        
        For i = 0 To UBound(myParts)
            myResult(i) = Trim(myParts(i)) & myOuter
        Next i
    Else
        ReDim myResult(0)
        myResult(0) = myS
    End If
    
    ExpandDept = myResult
End Function


' ========================================
' ③ SplitByComma
' ========================================
Sub SplitByComma()
    ' ========================================
    ' 変数宣言
    ' ========================================
    Dim myWsSrc     As Worksheet
    Dim myWsDst     As Worksheet
    Dim myLastRow   As Long
    Dim myOutRow    As Long
    Dim i           As Long
    Dim myRaw       As String
    Dim myParts     As Variant
    Dim myP         As Variant
    
    ' ========================================
    ' ワークシートオブジェクトの取得
    ' ========================================
    Set myWsSrc = ThisWorkbook.Sheets("Sheet3")
    
    On Error Resume Next
    Set myWsDst = ThisWorkbook.Sheets("Sheet4")
    On Error GoTo 0
    
    If myWsDst Is Nothing Then
        Set myWsDst = Worksheets.Add(After:=myWsSrc)
        myWsDst.Name = "Sheet4"
    Else
        myWsDst.Cells.Clear
    End If
    
    ' ========================================
    ' ヘッダー行の設定
    ' ========================================
    With myWsDst
        .Cells(1, 1).Value = "医療機関番号"
        .Cells(1, 2).Value = "医療機関名"
        .Cells(1, 3).Value = "診療科"
        .Cells(1, 4).Value = "郵便番号"
        .Cells(1, 5).Value = "住所"
    End With
    
    ' ========================================
    ' メイン処理
    ' ========================================
    myLastRow = myWsSrc.Cells(myWsSrc.Rows.Count, "A").End(xlUp).Row
    myOutRow = 2
    
    For i = 2 To myLastRow
        myRaw = Trim(myWsSrc.Cells(i, 3).Value & "")
        
        If myRaw <> "" Then
            If InStr(myRaw, "、") > 0 Then
                myParts = Split(myRaw, "、")
            Else
                ReDim myParts(0)
                myParts(0) = myRaw
            End If
            
            For Each myP In myParts
                myP = Trim(myP & "")
                
                If myP <> "" Then
                    With myWsDst
                        .Cells(myOutRow, 1).Value = myWsSrc.Cells(i, 1).Value
                        .Cells(myOutRow, 2).Value = myWsSrc.Cells(i, 2).Value
                        .Cells(myOutRow, 3).Value = myP
                        .Cells(myOutRow, 4).Value = myWsSrc.Cells(i, 4).Value
                        .Cells(myOutRow, 5).Value = myWsSrc.Cells(i, 5).Value
                    End With
                    
                    myOutRow = myOutRow + 1
                End If
            Next myP
        End If
    Next i
End Sub


' ========================================
' ④ SplitByCircledNumbers
' ========================================
Sub SplitByCircledNumbers()
    ' ========================================
    ' 変数宣言
    ' ========================================
    Dim myWsSrc     As Worksheet
    Dim myWsDst     As Worksheet
    Dim myLastRow   As Long
    Dim myOutRow    As Long
    Dim i           As Long
    Dim myRaw       As String
    Dim myParts     As Variant
    Dim myP         As Variant
    
    ' ========================================
    ' ワークシートオブジェクトの取得
    ' ========================================
    Set myWsSrc = ThisWorkbook.Sheets("Sheet4")
    
    On Error Resume Next
    Set myWsDst = ThisWorkbook.Sheets("Sheet5")
    On Error GoTo 0
    
    If myWsDst Is Nothing Then
        Set myWsDst = Worksheets.Add(After:=myWsSrc)
        myWsDst.Name = "Sheet5"
    Else
        myWsDst.Cells.Clear
    End If
    
    ' ========================================
    ' ヘッダー行の設定
    ' ========================================
    With myWsDst
        .Cells(1, 1).Value = "医療機関番号"
        .Cells(1, 2).Value = "医療機関名"
        .Cells(1, 3).Value = "診療科"
        .Cells(1, 4).Value = "郵便番号"
        .Cells(1, 5).Value = "住所"
    End With
    
    ' ========================================
    ' メイン処理
    ' ========================================
    myLastRow = myWsSrc.Cells(myWsSrc.Rows.Count, "A").End(xlUp).Row
    myOutRow = 2
    
    For i = 2 To myLastRow
        myRaw = Trim(myWsSrc.Cells(i, 3).Value & "")
        
        If myRaw <> "" Then
            myParts = SplitCircled(myRaw)
            
            For Each myP In myParts
                myP = Trim(myP & "")
                
                If myP <> "" Then
                    With myWsDst
                        .Cells(myOutRow, 1).Value = myWsSrc.Cells(i, 1).Value
                        .Cells(myOutRow, 2).Value = myWsSrc.Cells(i, 2).Value
                        .Cells(myOutRow, 3).Value = myP
                        .Cells(myOutRow, 4).Value = myWsSrc.Cells(i, 4).Value
                        .Cells(myOutRow, 5).Value = myWsSrc.Cells(i, 5).Value
                    End With
                    
                    myOutRow = myOutRow + 1
                End If
            Next myP
        End If
    Next i
End Sub


Private Function SplitCircled(ByVal myS As String) As Variant
    Dim myResult()  As String
    Dim myBuf       As String
    Dim i           As Long
    Dim myCh        As String
    Dim myCount     As Long
    
    ReDim myResult(0)
    myBuf = ""
    myCount = 0
    
    For i = 1 To Len(myS)
        myCh = Mid$(myS, i, 1)
        
        If AscW(myCh) >= &H2460 And AscW(myCh) <= &H2473 Then
            If Trim(myBuf) <> "" Then
                myResult(myCount) = Trim(myBuf)
                myCount = myCount + 1
                ReDim Preserve myResult(myCount)
            End If
            
            myBuf = ""
        Else
            myBuf = myBuf & myCh
        End If
    Next i
    
    If Trim(myBuf) <> "" Then
        myResult(myCount) = Trim(myBuf)
    Else
        If myCount = 0 Then
            myResult(0) = myS
        End If
    End If
    
    SplitCircled = myResult
End Function


' ========================================
' ⑤ SplitByCommaOnly
' ========================================
Sub SplitByCommaOnly()
    ' ========================================
    ' 変数宣言
    ' ========================================
    Dim myWsSrc     As Worksheet
    Dim myWsDst     As Worksheet
    Dim myLastRow   As Long
    Dim myOutRow    As Long
    Dim i           As Long
    Dim myRaw       As String
    Dim myParts     As Variant
    Dim myP         As Variant
    
    ' ========================================
    ' ワークシートオブジェクトの取得
    ' ========================================
    Set myWsSrc = ThisWorkbook.Sheets("Sheet5")
    
    On Error Resume Next
    Set myWsDst = ThisWorkbook.Sheets("Sheet6")
    On Error GoTo 0
    
    If myWsDst Is Nothing Then
        Set myWsDst = Worksheets.Add(After:=myWsSrc)
        myWsDst.Name = "Sheet6"
    Else
        myWsDst.Cells.Clear
    End If
    
    ' ========================================
    ' ヘッダー行の設定
    ' ========================================
    With myWsDst
        .Cells(1, 1).Value = "医療機関番号"
        .Cells(1, 2).Value = "医療機関名"
        .Cells(1, 3).Value = "診療科"
        .Cells(1, 4).Value = "郵便番号"
        .Cells(1, 5).Value = "住所"
    End With
    
    ' ========================================
    ' メイン処理
    ' ========================================
    myLastRow = myWsSrc.Cells(myWsSrc.Rows.Count, "A").End(xlUp).Row
    myOutRow = 2
    
    For i = 2 To myLastRow
        myRaw = Trim(myWsSrc.Cells(i, 3).Value & "")
        
        If myRaw <> "" Then
            myRaw = Replace(myRaw, ",", "、")
            myRaw = Replace(myRaw, ",", "、")
            
            If InStr(myRaw, "、") > 0 Then
                myParts = Split(myRaw, "、")
            Else
                ReDim myParts(0)
                myParts(0) = myRaw
            End If
            
            For Each myP In myParts
                myP = Trim(myP & "")
                
                If myP <> "" Then
                    With myWsDst
                        .Cells(myOutRow, 1).Value = myWsSrc.Cells(i, 1).Value
                        .Cells(myOutRow, 2).Value = myWsSrc.Cells(i, 2).Value
                        .Cells(myOutRow, 3).Value = myP
                        .Cells(myOutRow, 4).Value = myWsSrc.Cells(i, 4).Value
                        .Cells(myOutRow, 5).Value = myWsSrc.Cells(i, 5).Value
                    End With
                    
                    myOutRow = myOutRow + 1
                End If
            Next myP
        End If
    Next i
End Sub

 以上の処理により第一正規形となった.

まとめ

 全国の地方厚生局のサイトにある各医療機関の診療科名を取得し,第一正規形にした.Power Queryだけでは対応しきれず,VBAによる処理が必要であった.最終的な件数は484,435行であった.医療機関数は156,045行であった.データは2026年1月時点のものである.

 生成AIをコーディングに使用した.入力と出力を1対1で対応させ,1ステップずつ処理を行わせることで目的を達した.

コメントを残す

メールアドレスが公開されることはありません。 が付いている欄は必須項目です

このサイトはスパムを低減するために Akismet を使っています。コメントデータの処理方法の詳細はこちらをご覧ください