How to Extract Data from a Spreadsheet using VLOOKUP, MATCH and INDEX
Using VLOOKUP
When VLOOKUP finds the identifier that you specify in the source data, it can then find any cell in that row and return the information to you. Note that in the source data, the identifier must be in the first column of the table.
Syntax
The syntax of the VLOOKUP function is:
=VLOOKUP(lookup value, table range, column number, [true/false])
Here’s what these arguments mean:
Lookup value. The cell that has the unique identifier.
Table range. The range of cells that has the identifier in the first column, followed by the rest of the data in the other columns.
Column number. The number of the column that has the data you’re looking for. Don’t get that confused with the column’s letter. In the above illustration, the states are in column 4.
True/False. This argument is optional. True means that an approximate match is acceptable, and False means that only an exact match is acceptable.
We want to find sales amounts from the table in the illustration above, so we use these arguments:
Define a Range Name to Create an Absolute Reference
In Vlookup example.xlsx, look at the Sales Amounts worksheet. We’ll enter the formula in B5, then use the AutoFill feature to copy the formula down the sheet. That means the table range in the formula has to be an absolute reference. A good way to do that is to define a name for the table range.
Defining a Range Name in Excel
Before entering the formula, go to the source data worksheet.
Select all the cells from A4 (header for the Order # column) down through H203. A quick way of doing it is to click A4, then press Ctrl-Shift-End (Command-Shift-End on the Mac).
Click inside the Name Box above column A (the Name Box now displays A4).
Type data, then press Enter.
You can now use the name data in the formula instead of $A$4:$H$203
Defining a Range name in Google Sheets
In Google Sheets, defining a name is a little different.
Click the first column header of your source data, then press Ctrl-Shift-Right Arrow (Command-Shift-Right Arrow on the Mac). That selects the row of column headers.
Press Ctrl-Shift-Down Arrow (Command-Shift-Down Arrow on the Mac). That selects the actual data.
Click the Data menu, then select Named and protected ranges.
In the Name and protected ranges box on the right, type data, then click Done.
Entering the Formula
To enter the formula, go to the Sales Amounts worksheet and click in B5.
Enter the formula:
=VLOOKUP(A5,data,8,FALSE)
Press Enter.
The result should be 40. To fill in the values down the column, click back on B5, if necessary. Put the mouse pointer on the AutoFill dot in the cell’s lower-right corner, so the mouse pointer becomes a cross hair.
Double-click to fill the values down the column.
If you want, you can run the VLOOKUP function in the next few columns to extract other fields, like last name or state.
Syntax
The syntax of the MATCH function is:
=MATCH(lookup value, table range, [match type])
The arguments are:
Lookup value. The cell that has the unique identifier.
Table range. The range of cells you’re searching.
Match type. Optional. It’s how you specify how close of a match you want, as follows:
Next highest value
-1
Values must be in descending order.
Target value
0
Values can be in any order.
Next lowest value
1
Default type. Values must be in ascending order.
As with the VLOOKUP function, you’ll probably find the MATCH function easier to use if you apply a range name. Go to the Source Data sheet, select from B4 (column header for order #) to the bottom, click in the Name box above column A, and call it order_number. Note that the values are in ascending order.
Go to the Match tab of the worksheet. In B5, enter the MATCH function:
If you didn’t define a range name, you’d write the function as:
=MATCH(A5,'Source Data'!A5:A203,0)
Either way, you can see that this is in the 14th position (making it the 13th order).
Syntax
The syntax of the INDEX function is:
=INDEX(data range, row number, [column number])
The arguments are:
Data range. Just like the other two functions, this is the table of data.
Row number. The row number of the data, which is not necessarily the row of the worksheet. If the table range starts on row 10 of the sheet, then that’s row #1.
Column number. The column number of the data range. If the range starts on column E, that’s column #1.
Excel’s documentation will tell you that the column number argument is optional, but the row number is sort of optional, too. If the table range has only one row or one column, you don’t have to use the other argument.
Go to the Index sheet of the workbook and click in C6. We first want to find what’s contained in row 9, column 3 of the table. In the formula, we’ll use the range name that we created earlier.
Enter the formula:
=INDEX(data,A6,B6)
It returns a customer’s last name: Strevell. Change the values of A6 and B6, and the result in C6 will show different results (note that many rows have the same states and product names).
Conclusion
The ability of a worksheet to look at another worksheet and extract data is a great tool. This way, you can have one sheet that contains all the data you need for many purposes, then extract what you need for specific instances